First Edition

Web Performance Daybook, Volume 2

Stoyan Stefanov

Published by O'Reilly Media

Beijing ⋅ Cambridge ⋅ Farnham ⋅ Köln ⋅ Sebastopol ⋅ Tokyo

Foreword

Steve Souders

In your hands is the largest collection of web performance articles ever published. It includes performance topics such as open source tools, caching, mobile networks and applications, automation, improving the user experience, HTML5, JavaScript, CSS3, metrics, ROI, and network protocols. The authors are a diverse group, ranging from employees of the world’s largest web companies to independent consultants. At least seven web performance startups are represented among the contributors: Blaze, CloudFlare, Log Normal, Strangeloop, Torbit, Turbobytes, and Zoompf. The range of topics and contributors is impressive. But what really impresses me is that, in addition to their day jobs, every contributor also runs one or more open source projects, blogs, writes books, speaks at conferences, organizes meetups, or runs a non-profit. Some do all of these. After a full day of taming JavaScript across a dozen major browsers or tracking down the regression that made page load times spike, what compels these people to contribute back to the web performance community during their “spare time”? Here are some of the responses I’ve received when asking this question:

Lack of Formal Training

Many of us working on the Web learned our craft on the job. Web topics either weren’t in our college curriculum, or what we did learn isn’t applicable to what we do now. This on-the-job training is a long process involving a lot of trial and error. Sharing best practices raises the group IQ and lets new people entering the field come up to speed more quickly.

Avoid Repeating the Same Mistakes

Mistakes happen during this trial-and-error process. Sometimes a lot of mistakes happen. We have all experienced banging our heads against a problem in the wee hours of the morning or for days on end, often stumbling on the solution only after a long process of elimination. Thankfully, our sense of community doesn’t allow us to stand by mutely while we watch our peers heading for the same pitfalls. Sharing the solutions we found lets others avoid the same mistakes we made.

Obsessed with Optimization

By their nature, developers are drawn to optimization. We all strive to make our code the fastest, our algorithms the most efficient, and our architectures the most resilient. This obsession doesn’t stop with our own websites; we want every website to be optimized. The best way to do that is to share what we know.

Like to Help

Finally, some people just really like to help others. That look on someone’s face when they realize they just saved a week of work or made their site twice as fast makes us feel like we’ve helped the community grow.

As a testimony to this sense of sharing, the authors have dedicated all royalties of this book to the WPO Foundation, a non-profit organization that supports the web performance community. Thus, you can enjoy the chapters that lie ahead not only because they are some of the best web performance advice on the planet, but also because they were given to the web performance community selflessly. Enjoy!

From the Editor

Stoyan Stefanov

In the true spirit of high-performance, non-blocking asynchronous delivery, you now have the Web Performance Daybook, Volume 2, published before Volume 1. I hope you'll enjoy reading the book as much as I enjoyed working on it and rubbing (virtual) shoulders with some of the brightest people in our industry.

Back in December 2009, I wanted to give an overview of the web performance optimization (WPO) discipline. I decided on a self-imposed deadline of an article a day from December 1 to 24: the format of an advent calendar similar to http://www.24ways.org. As it turned out, 24 articles in a row was quite a challenge, so I was happy and grateful to accept the offers of help from a few friends from the industry: Christian Heilmann (Mozilla), Eric Goldsmith (AOL), and two posts from Ara Pehlivanian (Yahoo!).

The articles were warmly received by the community, and the following year, in December 2010, the calendar was already something people were looking forward to reading. The calendar also got a new home at http://calendar.perfplanet.com as a subdomain of the “Planet Performance” feed aggregator. This time around, more people were willing to help. Developers from all around our industry were willing to contribute their time, share and spread their knowledge, and announce new tools, and in this way create a much better set of 24 articles than a single person could. This is what will soon become Volume 1 of the series of Daybooks.

Then came December 2011, and we had so much good content and enthusiasm that we kept going past December 24, all the way to December 31, even publishing two articles on the last day. This is the content that you have in your hands in book format as Web Performance Daybook, Volume 2.

Our WPO community is young and small, but growing, and in need of nourishment in the form of community-building events such as the advent calendar. That's why it was exciting to have the opportunity to collaborate on this title with O'Reilly and all 32 authors. I'm really happy with the result, and I know that both volumes will serve as a reference and introduction to performance tools, research, techniques, and approaches for years to come. There’s always the risk of outdated content in offline technical publications, but I see references to the calendar articles at the latest conferences all the time, so I'm confident this knowledge will remain fresh for quite a while, and some of it is even destined to become timeless.

Enjoy the book, prepare to learn from the brightest in the industry and, most of all, be ready to make the Web a better place for all of us!

About the Authors

Patrick Meenan

Patrick Meenan (http://blog.patrickmeenan.com/) (@patmeenan) created WebPagetest (http://www.webpagetest.org/) while working at AOL and now works at Google with the team that is working to make the Web faster (http://code.google.com/speed/).

Nicholas Zakas

Nicholas C. Zakas (http://www.nczonline.net/) (@slicknet) is chief architect of WellFurnished, a site dedicated to helping you find beautiful home decor. Prior to that, he worked at Yahoo! for almost five years, where he was a presentation architect, frontend lead for the Yahoo! homepage, and a contributor to the YUI library. He is the author of Maintainable JavaScript (O’Reilly, 2012), Professional JavaScript for Web Developers (Wrox, 2012), Professional Ajax (Wrox, 2007), and High Performance JavaScript (O’Reilly, 2010). Nicholas is a strong advocate for development best practices including progressive enhancement, accessibility, performance, scalability, and maintainability. He blogs regularly at http://www.nczonline.net/.

Guy Podjarny

Guy Podjarny (http://blaze.io/) (@guypod) is a web performance and security expert specializing in mobile web performance, and the CTO of Blaze. Guy spent the decade prior to Blaze as a software architect and web application security expert, driving the IBM Rational AppScan product line from inception to being the leading web application security assessment tool. Guy has filed over 15 patents, presented at numerous conferences, and published several professional papers.

Stoyan Stefanov

Stoyan Stefanov (http://phpied.com/) (@stoyanstefanov) is a Facebook engineer, former Yahoo!, writer (“JavaScript Patterns”, “Object-Oriented JavaScript”), speaker (JSConf, Velocity, Fronteers), toolmaker (Smush.it, YSlow 2.0), and a Guitar Hero wannabe (http://givepngachance.com/).

Tim Kadlec

Tim Kadlec (http://timkadlec.com) (@tkadlec) is a web developer living and working in northern Wisconsin. His diverse background, working with everyone from small companies to large publishers and industrial corporations, has allowed him to see how the careful application of web technologies can impact businesses of all sizes.

Tim organizes Breaking Development (http://bdconf.com), a biannual conference dedicated to web design and development for mobile devices.

He is currently writing a book entitled Implementing Responsive Design: Building Sites for an Anywhere, Everywhere Web (http://responsiveenhancement.com), due out in the fall of 2012.

Brian Pane

Brian Pane (http://www.brianp.net/) (@brianpane) is an Internet technology and product generalist. He has worked at companies including Disney, CNET, F5, and Facebook, and all along the way he has jumped at any opportunity to make software faster.

Josh Fraser

Josh Fraser (http://onlineaspect.com/) (@joshfraser) is the co-founder and CEO of Torbit, a company that automates front-end optimizations that are proven to increase the speed of your website. Josh graduated from Clemson University with a BS in computer science and previously founded a company called EventVue. He currently lives in Mountain View and is obsessed with speed.

Steve Souders

Steve Souders (http://stevesouders.com/) (@souders) works at Google (http://www.google.com/) on web performance and open source initiatives. His book, High Performance Web Sites, explains his best practices for performance; it was #1 in Amazon’s Computer and Internet bestsellers. His follow-up book, Even Faster Web Sites, provides performance tips for today’s Web 2.0 applications. Steve is the creator of YSlow, the performance analysis extension to Firebug, with over 2 million downloads. He also created Cuzillion, SpriteMe, and Browserscope. He serves as co-chair of Velocity, the web performance and operations conference from O’Reilly, and is co-founder of the Firebug Working Group. He taught CS193H: High Performance Web Sites at Stanford, and frequently speaks at conferences including OSCON, The Ajax Experience, SXSW, and Web 2.0 Expo.

Betty Tso

Betty is a Software Development Manager at Amazon. Prior to that, she led the Exceptional Performance Engineering team at Yahoo! and drove the engineering execution and development for Yahoo!'s top web performance products, such as YSlow and Roundtrip.

Betty is also an evangelist in the web performance optimization domain. She has spoken at Velocity Conferences, the Yahoo! Frontend Summit, and universities such as Georgia Tech, Duke, UIUC, University of Texas at Austin, and UCSD. She was also co-President of Yahoo! Women-in-Tech, an organization of more than 600 members that empowers women to succeed in their careers, fosters employee growth, and inspires young girls to pursue technical careers.

Israel Nir

Israel Nir (@shunra) likes creating stuff, breaking other stuff apart, coding, the number 0x17, and playing the ukulele. He also works as a team leader at Shunra, where he builds tools to make applications run faster.

Marcel Duran

Marcel Duran (http://javascriptrules.com/) is currently a Front End Engineer at Twitter, Inc. Prior to that, he worked on web performance optimization for high-traffic sites on Yahoo!'s Front Page and Search teams, where he applied and researched web performance best practices to make pages even faster. In his last role, as the Front End Lead for Yahoo!'s Exceptional Performance Team, he was dedicated to YSlow (now his personal open source project) and to the development, research, and evangelism of other performance tools.

Éric Daspet

Éric Daspet (http://eric.daspet.name/) (@edasfr) is a web consultant in France. He has written about PHP, founded the Paris-Web conferences to promote web quality, and is now pushing performance with a local user group and a future book.

Alois Reitbauer

Alois Reitbauer (http://blog.dynatrace.com/) (@aloisreitbauer) works as Technology Strategist for dynaTrace software and heads the dynaTrace Center of Excellence. As a major contributor to dynaTrace Labs technology, he influences the company's future technological direction. Besides his engineering work, he supports Fortune 500 companies in implementing successful performance management.

Matthew Prince

Matthew Prince (http://www.cloudflare.com/) (@eastdakota) is the co-founder and CEO of CloudFlare. Matthew wrote his first computer program when he was 7, and hasn’t been able to shake the bug since. After attending the University of Chicago Law School, he worked as an attorney for one day before jumping at the opportunity to be a founding member of a tech startup. He hasn’t looked back. CloudFlare is Matthew’s third entrepreneurial venture. On the side, Matthew teaches Internet law as an adjunct professor, is a certified ski instructor, and is a regular attendee of the Sundance Film Festival.

Buddy Brewer

Buddy Brewer (@bbrewer) is a co-founder of Log Normal, a company that shows you exactly how much time real people spend waiting on your website. He has worked on web performance issues in various roles for almost ten years.

Alexander Podelko

For the last fourteen years, Alex Podelko (http://alexanderpodelko.com/blog/) (@apodelko) has worked as a performance engineer and architect for several companies. Currently he is a Consulting Member of Technical Staff at Oracle, responsible for performance testing and optimization of Hyperion products. Alex also serves as a director for the Computer Measurement Group (CMG). He maintains a collection of performance-related links and documents.

Estelle Weyl

Estelle Weyl (http://www.standardista.com/) (@estellevw) started her professional life in architecture, then managed teen health programs. In 2000, she took the natural step of becoming a web standardista. She has consulted for Kodakgallery, Yahoo!, and Apple, among others. Estelle provides tutorials and detailed grids of CSS3 and HTML5 browser support in her blog. She is the author of Mobile HTML5 (O’Reilly, Oct. 2011) and HTML5 and CSS3 for the Real World (Sitepoint, May 2011). While not coding, she works in construction, de-hippifying her 1960s throwback abode.

Aaron Peters

Aaron Peters (http://www.aaronpeters.nl/en/) (@aaronpeters) is an independent web performance consultant based in The Netherlands. He is a Red Hot Chili Peppers fan and will kick your butt in a snowboard contest anytime.

Tony Gentilcore

Tony Gentilcore (@tonygentilcore) is a software engineer at Google. He enjoys making the Web faster and has recently added support for Web Timing and async scripts to Google Chrome/WebKit.

Matthew Steele

Matthew Steele is a software engineer at Google, working on making the Web faster. Matthew has worked on Page Speed for Firefox and Chrome, has contributed to mod_pagespeed, and most recently has led the design and development of mod_spdy for Apache.

Bryan McQuade

Bryan McQuade (@bryanmcquade) leads the Page Speed team at Google. He has contributed to various projects that make the Web faster, including Shared Dictionary Compression over HTTP and optimizing web servers to better utilize HTTP.

Tobie Langel

Tobie Langel (http://tobielangel.com/) (@tobie) is a software engineer at Facebook. He’s also Facebook’s W3C AC Rep. An avid open-source contributor (https://github.com/tobie), he’s mostly known for having co-maintained the Prototype JavaScript Framework. Tobie recently picked up blogging again and rants at blog.tobie.me (http://blog.tobie.me/). In a previous life, he was a professional jazz drummer.

Billy Hoffman

If there is one thing Billy Hoffman believes in, it’s transparency. In fact, he once got sued over it, but that is another story. Billy continues to push for transparency as founder and CEO of Zoompf, whose products provide visibility into your website’s performance by identifying the specific issues that are slowing your site down. You can follow Zoompf on Twitter (http://twitter.com/zoompf) and read Billy’s performance research on Zoompf’s blog Lickity Split (http://zoompf.com/blog).

Joshua Bixby

Joshua Bixby (@JoshuaBixby) is president of Strangeloop (http://www.strangeloopnetworks.com/), which provides website acceleration solutions to companies like eBay/PayPal, Visa, Petco, Wine.com, and O’Reilly Media. Joshua also maintains the blog Web Performance Today (http://www.webperformancetoday.com/), which explores issues and ideas about site speed, user behavior, and performance optimization.

Sergey Chernyshev

Sergey Chernyshev (http://www.sergeychernyshev.com/) (@sergeyche) organizes the New York Web Performance Meetup and helps other performance enthusiasts around the world start meetups in their cities. Sergey volunteers his time to run @perfplanet, the Twitter companion to the PerfPlanet site. He is also an open source developer and the author of a few web performance-related tools including ShowSlow, SVN Assets, drop-in .htaccess, and more.

JP Castro

JP Castro (@jphpsf) is a frontend engineer living in San Francisco. He’s passionate about web development and specifically web performance. He blogs at http://blog.jphpsf.com and co-organizes the San Francisco performance meetup. When he’s not talking about performance, he enjoys spending time with his family, being outdoors, sipping craft beers, consuming a full jar of Nutella, and playing video games.

Pavel Paulau

Pavel Paulau (@pavelpaulau) is a performance engineer from Minsk, Belarus. Besides his daily work at Couchbase (http://www.couchbase.com), he tries to spread the importance of speed as co-author of the WebPerformance.ru blog (http://webperformance.ru/).

David Calhoun

David Calhoun (@franksvalli) is an independent frontend developer who has been splitting his time between California and Japan. He’s the community news writer for JSMag and keeps a blog (http://davidbcalhoun.com/) with developer and general life thoughts (hard to put that philosophy degree to use…).

David specializes in mobile, frontend performance, and sure enough, mobile performance. He formerly worked for Yahoo! Mobile and CBSi/CNET, occasionally contracts for WebMocha, and is currently contracting at Skybox Imaging, working on interfaces for flying satellites from browsers.

Nicole Sullivan

Nicole Sullivan (http://stubbornella.org/) (@stubbornella) is an evangelist, frontend performance consultant, CSS Ninja, and author. She started the Object-Oriented CSS open source project, which answers the question: how do you scale CSS for millions of visitors or thousands of pages? She also consulted with the W3C for their beta redesign, and is the co-creator of Smush.it, an image optimization service in the cloud.

Nicole is passionate about CSS, web standards, and scalable frontend architecture for large commercial websites. She speaks about performance at conferences around the world, most recently at The Ajax Experience, ParisWeb, and Web Directions North. She co-authored Even Faster Web Sites and blogs at stubbornella.org.

James Pearce

James (http://tripleodeon.com/) (@jamespearce) is Head of Mobile Developer Relations at Facebook. He lives in California and in airports around the world.

Tom Hughes-Croucher

Tom (http://tomhughescroucher.com/) (@sh1mmer) is the principal consultant at Jetpacks for Dinosaurs, which helps make websites really rather fast. Tom consults with clients like Walmart and MySpace, to name a few. An industry veteran, Tom has worked for the likes of Yahoo!, Joyent, NASA, Tesco, and many more. Tom co-authored Up and Running with Node.js and lives in San Francisco, CA.

Dave Artz

David Artz leads the Site Engineering team at AOL. He previously led AOL’s Optimization team, which focused on setting standards and developing best practices in frontend engineering, performance, and SEO across the teams he now leads. While managing multiple teams, he has continued to develop script/CSS/font loaders as part of his Boot library (https://github.com/artzstudio/Boot), an AMD loader for jQuery (https://github.com/artzstudio/jQuery-AMD), and a jQuery plug-in called Sonar (https://github.com/artzstudio/jQuery-Sonar) for easily loading content and functionality on demand using special “scrollin” and “scrollout” events.

Preface

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This icon signifies a tip, suggestion, or general note.

Caution

This icon indicates a warning or caution.

Using Code Examples

This book is here to help you get your job done. In general, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Web Performance Daybook, Volume Two, edited by Stoyan Stefanov (O’Reilly). Copyright 2012 Stoyan Stefanov, 978-1-449-33291-4.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

Safari® Books Online

Note

Safari Books Online (www.safaribooksonline.com) is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

技术专业人士、软件开发人员、网页设计师以及商业和创意专业人士使用 Safari Books Online 作为研究、解决问题、学习和认证培训的主要资源。

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online 为组织、政府机构和个人提供一系列产品组合和定价计划。订阅者可以在一个完全可搜索的数据库中访问数千本书籍、培训视频和预出版手稿,这些内容来自 O'Reilly Media、Prentice Hall Professional、Addison-Wesley Professional、Microsoft Press、Sams、Que、Peachpit Press、Focal Press、Cisco Press、John Wiley & Sons、Syngress、Morgan Kaufmann、IBM Redbooks、Packt、Adobe Press、FT Press、Apress、Manning、New Riders、McGraw-Hill、Jones & Bartlett、Course Technology 等数十家出版商。有关 Safari Books Online 的更多信息,请在线访问我们。

Safari Books Online offers a range of product mixes and pricing programs for organizations, government agencies, and individuals. Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and dozens more. For more information about Safari Books Online, please visit us online.

如何联系我们

How to Contact Us

请向出版商提出有关本书的意见和问题:

Please address comments and questions concerning this book to the publisher:

O'Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

我们有本书的网页,其中列出了勘误表、示例和任何其他信息。您可以通过以下地址访问此页面:

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at:

http://oreil.ly/web_perf_daybook_v2

要评论或询问有关本书的技术问题,请发送电子邮件至:

To comment or ask technical questions about this book, send email to:

有关我们的书籍、课程、会议和新闻的更多信息,请访问我们的网站:http://www.oreilly.com

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

在 Facebook 上找到我们:http://facebook.com/oreilly

Find us on Facebook: http://facebook.com/oreilly

在 Twitter 上关注我们:http://twitter.com/oreillymedia

Follow us on Twitter: http://twitter.com/oreillymedia

在 YouTube 上观看我们的视频:http://www.youtube.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

第 1 章 WebPagetest 内部结构

Chapter 1. WebPagetest Internals

帕特里克 ·米南

Patrick Meenan

我想利用今年的机会稍微了解一下WebPagetest如何从浏览器收集性能数据。Windows 上的其他工具使用类似的技术,但此处的信息可能无法代表其他工具的工作方式。

I thought I’d take the opportunity this year to give a little bit of visibility into how WebPagetest gathers the performance data from browsers. Other tools on Windows use similar techniques, but the information here may not be representative of how other tools work.

首先,它有助于从浏览器的角度理解 Windows 上的网络堆栈(图 1-1)。

First off, it helps to understand the networking stack on Windows from a browser’s perspective (Figure 1-1).


图 1-1。从浏览器的角度来看 Windows 网络堆栈

Figure 1-1. Windows networking stack from browser’s perspective

浏览器是什么并不重要:只要它在 Windows 上运行,架构就几乎必然类似于上图,所有通信都通过 Windows 套接字 API 进行。就此而言,几乎任何在 Windows 上使用 TCP/IP 通信的应用程序都类似于上图。

It doesn’t matter what the browser is: if it runs on Windows, the architecture pretty much has to look like the diagram above, where all of the communications go through the Windows socket APIs. For that matter, just about any application that talks TCP/IP on Windows looks like the picture above.

函数拦截

Function Interception

WebPagetest 工作原理的关键在于它能够拦截任意函数调用并在将请求或响应传递到原始实现(或选择根本不传递)之前检查或更改请求或响应。幸运的是,其他人完成了大部分繁重的工作,并提供了一个很好的开源库(http://newgre.net/ncodehook),它可以为您处理细节,但它基本上是这样工作的:

The key to how WebPagetest works is its ability to intercept arbitrary function calls and inspect or alter the request or response before passing it on to the original implementation (or choosing not to pass it on at all). Luckily someone else did most of the heavy lifting and provided a nice open source library (http://newgre.net/ncodehook) that can take care of the details for you but it basically works like this:

  • 在内存中查找目标函数(如果是从 dll 导出的话则很简单)。

  • Find the target function in memory (trivial if it is exported from a dll).

  • 复制函数的前几个字节(确保保持 x86 指令完整)。

  • Copy the first several bytes from the function (making sure to keep x86 instructions intact).

  • 使用跳转到新函数来覆盖函数条目。

  • Overwrite the function entry with a jmp to the new function.

  • 提供一个替换函数,其中包括从原始函数复制的字节以及到其余代码的跳转。

  • Provide a replacement function that includes the bytes copied from the original function along with a jmp to the remaining code.

这是非常棘手的事情,如果您不非常小心,事情往往会变得 非常错误,但是通过明确定义的函数(如所有 Windows API),您几乎可以拦截您想要的任何内容。

It’s pretty hairy stuff and things tend to go very wrong if you aren’t extremely careful, but with well-defined functions (like all of the Windows APIs), you can pretty much intercept anything you’d like.
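The machine-code patching itself is platform-specific, but the control flow of the steps above can be illustrated at a much higher level. The following JavaScript sketch is only an analogy and is not WebPagetest's code: `intercept`, `onCall`, and `fakeSocketApi` are hypothetical names, and swapping an object property stands in for overwriting the function entry with a jmp.

```javascript
// Analogy only: swap a property instead of patching machine code.
// intercept() and fakeSocketApi are hypothetical names for illustration.
function intercept(target, name, onCall) {
  const original = target[name]; // keep a way to reach the original code
  target[name] = function (...args) {
    // Inspect the call first; returning false means "do not pass it on".
    if (onCall(name, args) === false) {
      return undefined;
    }
    return original.apply(this, args); // hand off to the original
  };
  // Undo the hook by putting the original back.
  return function restore() {
    target[name] = original;
  };
}

// A stand-in for an API we want to observe.
const fakeSocketApi = {
  send(data) {
    return data.length; // pretend this is the number of bytes written
  },
};

const observed = [];
const restore = intercept(fakeSocketApi, "send", (name, args) => {
  observed.push({ name, bytes: args[0].length });
});

const bytesSent = fakeSocketApi.send("GET / HTTP/1.1");
restore();
```

The caller still gets the original return value, while `observed` records what passed through. That is the same inspect-then-forward shape described above, minus the hard part of keeping x86 instructions intact.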

一个问题是,您只能将调用重定向到与原始函数在同一进程中运行的代码,如果您编写了代码,这很好,但如果您试图监视您无法控制的软件,则没有多大帮助这导致我们……

One catch is that you can only redirect calls to code running in the same process as the original function. That is fine if you wrote the code, but it doesn’t help much if you are trying to spy on software that you don’t control, which leads us to…

代码注入

Code Injection

对我来说幸运的是,Windows 提供了多种将任意代码注入进程的方法。这里对几种不同的技术进行了很好的概述:http://www.codeproject.com/KB/threads/winspy.aspx,实际上有更多的方法可以做到这一点,但它涵盖了基础知识。有些技术将代码插入到每个进程中,但我希望更有针对性,只检测我们感兴趣的特定浏览器实例,因此经过一系列实验(和可怕的失败)后,我最终使用了 CreateRemoteThread /LoadLibrary 技术本质上可以让您强制任何进程加载任意 dll 并在其中执行代码(假设您拥有必要的权限)。

Lucky for me, Windows provides several ways to inject arbitrary code into processes. There is a good overview of several different techniques here: http://www.codeproject.com/KB/threads/winspy.aspx, and there are actually more ways to do it than that but it covers the basics. Some of the techniques insert your code into every process but I wanted to be a lot more targeted and just instrument the specific browser instances that we are interested in, so after a bunch of experimentation (and horrible failures), I ended up using the CreateRemoteThread/LoadLibrary technique which essentially lets you force any process to load an arbitrary dll and execute code in it (assuming you have the necessary rights).

由此产生的浏览器架构

Resulting Browser Architecture

既然我们可以拦截任意函数调用,那么只需识别“有趣”的函数,最好是所有浏览器都使用的函数,这样您就可以重用尽可能多的代码。在WebPagetest中,我们拦截所有与解析主机名、连接套接字以及读取或写入数据有关的Winsock调用(图1-2)。

Now that we can intercept arbitrary function calls, it just becomes a matter of identifying the “interesting” functions, preferably ones that are used by all the browsers so you can reuse as much code as possible. In WebPagetest, we intercept all the Winsock calls that have to do with resolving host names, connecting sockets, and reading or writing data (Figure 1-2).


图 1-2。浏览器架构

Figure 1-2. Browser architecture

这使我们能够通过浏览器访问所有网络访问,并且我们本质上只是跟踪浏览器正在做什么。除了必须解码原始字节流之外,它非常简单,并且为我们提供了一种在所有浏览器上进行测量的一致方法。SSL 确实增加了一些麻烦,因此我们还拦截对浏览器使用的各种 SSL 库的调用,以便我们可以看到数据的未加密版本。这对于 Chrome 来说有点困难,因为该库被编译到 Chrome 代码本身中,但幸运的是,它们为每个构建提供了调试符号,因此我们仍然可以在内存中找到代码。

This gives us visibility into all the network activity from the browser, and we essentially just keep track of what the browsers are doing. Other than having to decode the raw byte streams, it is pretty straightforward and gives us a consistent way to do the measurements across all browsers. SSL does add a bit of a wrinkle, so we also intercept calls to the various SSL libraries that the browsers use so that we can see the unencrypted version of the data. This is a little more difficult for Chrome since the library is compiled into the Chrome code itself, but luckily they make debug symbols available for every build so we can still find the code in memory.

使用相同的技术来拦截来自浏览器的绘图调用,以便我们可以知道它何时绘制到屏幕上(用于开始渲染测量)。

The same technique is used to intercept drawing calls from the browser so we can tell when it paints to the screen (for the start render measurement).

获取代码

Get the Code

由于 WebPagetest 采用 BSD 许可证,欢迎您出于您想要的任何目的重复使用任何代码。该项目位于 Google Code 上:( http://code.google.com/p/webpagetest/ ),一些更有趣的文件是:

Since WebPagetest is under a BSD license you are welcome to reuse any of the code for whatever purposes you’d like. The project lives on Google Code here: (http://code.google.com/p/webpagetest/) and some of the more interesting files are:

浏览器的进步

Browser Advancements

幸运的是,浏览器开始以标准方式公开更多有趣的信息,并且随着 W3C 资源计时规范 ( http://w3c-test.org/webperf/specs/ResourceTiming/ ) 的进步,您将能够访问大量此类信息通过 JavaScript 直接从浏览器获取信息(甚至来自您的最终用户!)。

Luckily, browsers are starting to expose more interesting information in standard ways and as the W3C Resource Timing spec (http://w3c-test.org/webperf/specs/ResourceTiming/) advances, you will be able to access a lot of this information directly from the browser through JavaScript (even from your end users!).
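As a rough illustration of what that access can look like, the sketch below summarizes Resource Timing entries into per-resource durations. It assumes a browser that implements `performance.getEntriesByType("resource")`; `summarizeResources` is a hypothetical helper, and the only entry fields it reads are `name`, `startTime`, and `responseEnd`.

```javascript
// Summarize Resource Timing entries into per-resource durations.
// summarizeResources() is a hypothetical helper; the entry fields it reads
// (name, startTime, responseEnd) come from the Resource Timing interface.
function summarizeResources(entries) {
  return entries
    .map((entry) => ({
      url: entry.name,
      // Time from the start of the fetch until the last byte arrived.
      ms: Math.round(entry.responseEnd - entry.startTime),
    }))
    .sort((a, b) => b.ms - a.ms); // slowest first
}

// Where the API is available, feed it real entries; otherwise use none.
const entries =
  typeof performance !== "undefined" &&
  typeof performance.getEntriesByType === "function"
    ? performance.getEntriesByType("resource")
    : [];
const slowest = summarizeResources(entries).slice(0, 5);
```

A page could send `slowest` back to a collection endpoint to gather this data from real visitors, which is the "even from your end users" case mentioned above.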

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/webpagetest-internals/。最初发布于 2011 年 12 月 1 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/webpagetest-internals/. Originally published on Dec 01, 2011.

第 2 章 localStorage 读取性能

Chapter 2. localStorage Read Performance

尼古拉斯 ·扎卡斯

Nicholas Zakas

Web Storage ( http://dev.w3.org/html5/webstorage/ ) 已迅速成为 Web 开发人员工具包中最受欢迎的 HTML5 相关附加功能之一。更具体地说,localStorage它已经在世界各地的 Web 开发人员的心中找到了一个家,提供了非常快速、简单的客户端数据存储,可以跨会话持久保存。通过简单的键值接口,我们已经看到网站localStorage以独特且有趣的方式利用:

Web Storage (http://dev.w3.org/html5/webstorage/) has quickly become one of the most popular HTML5-related additions to the web developer toolkit. More specifically, localStorage has found a home in the hearts and minds of web developers everywhere, providing very quick and easy client-side data storage that persists across sessions. With a simple key-value interface, we’ve seen sites take advantage of localStorage in unique and interesting ways:

在我见过的用例中,Google/Bing 方法似乎越来越受欢迎。部分原因是使用 HTML5 应用程序缓存的困难,部分原因是该技术从 Steve Souders 和其他人的工作中获得了广泛的宣传。事实上,我与人们讨论的越多localStorage,以及它对于存储 UI 相关信息的有用性,我发现越来越多的人已经开始尝试这种技术。

Of the use cases I’ve seen, the Google/Bing approach is one that seems to be gaining in popularity. This is partly due to the difficulties of working with the HTML5 application cache and partly due to the publicity that this technique has gained from the work of Steve Souders and others. Indeed, the more I talk to people about localStorage and how useful it can be for storing UI-related information, the more people I find who have already started to experiment with this technique.

我发现这种用法的有趣之处localStorage在于,有一个内置但未声明的假设:读取localStorage是一种廉价的操作。我从其他开发人员那里听说过一些奇怪的性能问题,因此我开始量化 的性能特征localStorage,以确定读取数据的实际成本。

What I find intriguing about this use of localStorage is that there’s a built-in, and yet unstated, assumption: that reading from localStorage is an inexpensive operation. I had heard anecdotally from other developers about strange performance issues, and so I set out to quantify the performance characteristics of localStorage, to determine the actual cost of reading data.

基准

The Benchmark

不久前,我创建并分享了一个简单的基准测试,用于比较从 localStorage 读取值与从对象属性读取值的速度。其他几个人调整了基准以获得更可靠的版本(http://jsperf.com/localstorage-vs-objects/10)。最终结果:在每个浏览器中,从 localStorage 读取数据都比从对象属性读取相同数据慢几个数量级。具体慢了多少?看一下图 2-1 中的图表(数字越高越好)。

Not too long ago, I created and shared a simple benchmark that measured reading a value from localStorage against reading a value from an object property. Several others tweaked the benchmark to arrive at a more reliable version (http://jsperf.com/localstorage-vs-objects/10). The end result: reading from localStorage is orders of magnitude slower in every browser than reading the same data from an object property. Exactly how much slower? Take a look at the chart in Figure 2-1 (higher numbers are better).


图 2-1。基准测试结果

Figure 2-1. Benchmark results

看完此图表后,您可能会感到困惑,因为似乎localStorage没有表示读取。事实上,它是有表现的,只是你看不到它,因为 数字太低了,用这个比例甚至看不到。除了 Safari 5 能够localStorage实际显示读数之外,其他浏览器之间的差异都非常大,以至于在这张图表上无法看到。当我调整 Y 轴值时,您现在可以看到测量值如何跨浏览器叠加:

You may be confused after looking at this chart because it appears that reading from localStorage isn’t represented. In fact, it is represented; you just can’t see it because the numbers are too low to be visible at this scale. With the exception of Safari 5, whose localStorage readings actually show up, every other browser has such a large difference that there’s no way to see it on this chart. When I adjust the Y-axis values, you can now see how the measurements stack up across browsers:


图 2-2。缩放结果

Figure 2-2. Scaled results

通过更改 Y 轴的比例,您现在可以看到localStorage与对象属性读取的真实比较(图 2-2)。但尽管如此,两者之间的差异仍然如此之大,以至于近乎滑稽。为什么?

By changing the scale of the Y-axis, you’re now able to see a true comparison of localStorage versus object property reads (Figure 2-2). But still, the difference between the two is so vast that it’s almost comical. Why?

这是怎么回事?

What’s Going On?

为了在浏览器会话中保持不变,中的值localStorage将写入磁盘。这意味着当您从 读取值时localStorage,您实际上是从硬盘驱动器读取一些字节。读取和写入硬盘驱动器是昂贵的操作,尤其是与读取和写入内存相比。本质上,这正是我的基准测试所测试的:从内存(对象属性)读取值与从磁盘读取值的速度(localStorage)。

In order to persist across browser sessions, values in localStorage are written to disk. That means when you’re reading a value from localStorage, you’re actually reading some bytes from the hard drive. Reading from and writing to a hard drive are expensive operations, especially as compared to reading from and writing to memory. In essence, that’s exactly what my benchmark was testing: the speed of reading a value from memory (object property) compared to reading a value from disk (localStorage).
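For readers who want to try a comparison like this themselves, here is a minimal timing sketch. It is not the jsPerf test code: `timeReads` is a hypothetical helper, the iteration count is arbitrary, and the localStorage half only runs where `window.localStorage` exists.

```javascript
// Time how long `iterations` calls of readFn take, in milliseconds.
// timeReads() is a hypothetical helper, not the jsPerf test code.
function timeReads(readFn, iterations) {
  const start = performance.now();
  let last;
  for (let i = 0; i < iterations; i++) {
    last = readFn(); // keep the result so the read has an observable effect
  }
  return { ms: performance.now() - start, last };
}

// Reading from memory: a plain object property.
const inMemory = { greeting: "hello" };
const objectRun = timeReads(() => inMemory.greeting, 100000);

// Reading from localStorage: only attempted inside a browser.
let storageRun = null;
if (typeof window !== "undefined" && window.localStorage) {
  window.localStorage.setItem("greeting", "hello");
  storageRun = timeReads(() => window.localStorage.getItem("greeting"), 100000);
}
```

Hand-rolled loops like this are noisy, so treat the two `ms` values as a rough indication rather than a measurement on par with the jsPerf runs.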

更有趣的是,localStorage数据是按源存储的,这意味着浏览器中的两个或多个选项卡可以localStorage同时访问相同的数据。对于需要弄清楚如何同步跨选项卡访问的浏览器实现者来说,这是一个很大的痛苦。当您尝试读取 时 localStorage,浏览器需要先停止并查看是否有任何其他选项卡正在访问同一区域。如果是这样,则必须等到访问完成才能读取该值。

Making matters more interesting is the fact that localStorage data is stored per-origin, which means that it’s possible for two or more tabs in a browser to be accessing the same localStorage data at the same time. This is a big pain for browser implementors who need to figure out how to synchronize access across tabs. When you attempt to read from localStorage, the browser needs to stop and see if any other tab is accessing the same area first. If so, it must wait until the access is finished before the value can be read.

因此,与读取相关的延迟localStorage是可变的——这在很大程度上取决于浏览器在该时间点发生的其他事情。

So the delay associated with reading from localStorage is variable—it depends a lot on what else is going on with the browser at that point in time.

优化策略

Optimization Strategy

考虑到从 localStorage 读取是有成本的,这对您使用它的方式有何影响?在得出结论之前,我运行了另一个基准测试(http://jsperf.com/localstorage-string-size)来确定从 localStorage 读取不同大小数据的影响。该基准测试将四个不同大小的字符串(100 个字符、500 个字符、1,000 个字符和 2,000 个字符)保存到 localStorage,然后再读出。结果有点令人惊讶:在所有浏览器中,读取的数据量并不影响读取的速度。

Given that there is a cost to reading from localStorage, how does that affect how you would use it? Before coming to a conclusion, I ran another benchmark (http://jsperf.com/localstorage-string-size) to determine the effect of reading different-sized pieces of data from localStorage. The benchmark saves four strings of different sizes (100, 500, 1,000, and 2,000 characters) into localStorage and then reads them out. The results were a little surprising: across all browsers, the amount of data being read did not affect how quickly the read happened.

我多次进行了测试,并恳求我的 Twitter 关注者 ( https://twitter.com/slicknet/status/139475625793699840 ) 获取更多信息。可以肯定的是,不同浏览器之间确实存在一些差异,但没有大到足以真正产生影响的程度。我的结论是:从单个密钥读取多少数据并不重要localStorage

I ran the test multiple times and implored my Twitter followers (https://twitter.com/slicknet/status/139475625793699840) to get more information. To be sure, there were a few variances across browsers, but none large enough to make a real difference. My conclusion: it doesn’t matter how much data you read from a single localStorage key.

我跟进了另一个基准(http://jsperf.com/localstorage-string-size-retrieval)来测试我的新结论,即最好尽可能少地进行读取。结果与早期基准测试相关,即在大多数浏览器中读取 100 个字符 10 次比读取 10,000 个字符一次慢 90% 左右。

I followed up with another benchmark (http://jsperf.com/localstorage-string-size-retrieval) to test my new conclusion that it’s better to do as few reads as possible. The results correlated with the earlier benchmark in that reading 100 characters 10 times was around 90% slower across most browsers than reading 10,000 characters one time.

鉴于此,从 localStorage 读取数据的最佳策略是使用尽可能少的键来存储尽可能多的数据。由于读取 10 个字符与读取 2,000 个字符所需的时间大致相同,因此请尝试将尽可能多的数据放入单个值中。每次调用 getItem()(或读取 localStorage 的属性)时,您都要付出代价,因此请确保这笔开销物有所值。越早将数据放入内存(无论是变量还是对象属性),所有后续操作就越快。

Given that, the best strategy for reading data from localStorage is to use as few keys as possible to store as much data as possible. Since it takes roughly the same amount of time to read 10 characters as it does to read 2,000 characters, try to put as much data as possible into a single value. You’re getting hit each time you call getItem() (or read from a localStorage property), so make sure that you’re getting the most out of the expense. The faster you get data into memory, either a variable or an object property, the faster all subsequent actions will be.
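One way to apply this strategy is to keep all related values under a single key as JSON and read it once. The sketch below is an assumption about how that might look, not code from the benchmarks: `STATE_KEY`, `saveState`, `loadState`, and the counting stand-in for localStorage are all hypothetical, and the stand-in exists only so the example runs outside a browser.

```javascript
// A tiny in-memory stand-in with the same getItem/setItem shape as
// localStorage, so the sketch runs outside a browser. It also counts reads.
function createCountingStorage() {
  const data = new Map();
  return {
    reads: 0,
    getItem(key) {
      this.reads++;
      return data.has(key) ? data.get(key) : null;
    },
    setItem(key, value) {
      data.set(key, String(value));
    },
  };
}

// Hypothetical key name; everything the UI needs lives under this one key.
const STATE_KEY = "app-state";

function saveState(storage, state) {
  storage.setItem(STATE_KEY, JSON.stringify(state));
}

// One storage read, then plain object property reads from memory.
function loadState(storage) {
  const raw = storage.getItem(STATE_KEY);
  return raw === null ? {} : JSON.parse(raw);
}

const storage = createCountingStorage(); // in a browser: window.localStorage
saveState(storage, { theme: "dark", fontSize: 14, sidebar: "collapsed" });

const state = loadState(storage);
const summary = `${state.theme}/${state.fontSize}/${state.sidebar}`;
```

Three values are used here, but `storage.reads` stays at 1. Storing them under three separate keys would have cost three reads.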

跟进

Follow Up

自从我第一次发表这篇文章以来,已经有很多关于localStorage性能的讨论。它始于 Mozilla 的 Chris Heilmann 的一篇博客文章,标题为“localStorage 没有简单的解决方案”。localStorage在那篇文章中,克里斯介绍了整体上存在性能问题的想法。在包括我自己在内的其他人发表了几篇后续博客文章之后,我终于能够联系到 Jonas Sicking,他是负责 localStorageFirefox 实施的工程师之一。确实,存在性能问题localStorage,但它并不像读取比简单对象上的读取时间长一点那么简单。问题的核心在于localStorage是一个同步 API,这使得浏览器在实现方面几乎没有选择。所有 localStorage数据都存储在磁盘上的文件中。这意味着为了让您能够在 JavaScript 中访问该数据,浏览器必须首先将该文件读入内存。当发生读取时,就会出现性能问题。第一次访问 时可能会发生这种情况 localStorage,但随后浏览器会在读取发生时冻结。当处理少量数据时,这可能不是什么大问题,但如果您使用了整个 5 MB 限制,则可能会产生明显的影响。localStorageFirefox 采用的另一种解决方案是在加载页面时读取数据文件。这确保了以后访问localStorage 尽可能快并且具有可预测的性能。该方法的缺点是从文件读取可能会对页面的加载时间产生不利影响。当我写这篇文章时,这个特定问题仍然没有解决方案。有些人呼吁用全新的 API 来取代,localStorage而另一些人则打算修复现有的 API。无论发生什么,很快就会在客户端数据存储领域进行更多的研究。

In the time since I first published this article, there has been a lot of discussion around localStorage performance. It began with a blog post by Mozilla's Chris Heilmann titled “There's No Simple Solution for localStorage.” In that post, Chris introduced the idea that localStorage as a whole has performance problems. After several follow-up blog posts by others, including myself, I was finally able to get in touch with Jonas Sicking, one of the engineers responsible for implementing localStorage in Firefox.

Indeed, there is a performance issue with localStorage, but it's not as simple as reads taking a bit longer than reads on a simple object. The heart of the problem is that localStorage is a synchronous API, which leaves the browser with very few choices as to implementation. All localStorage data is stored in a file on disk. That means that in order for you to have access to that data in JavaScript, the browser must first read that file into memory. The performance issue is when that read occurs. It could occur with the first access of localStorage, but then the browser would freeze while the read happened. That may not be a big deal when dealing with a small amount of data, but if you've used the whole 5 MB limit, there could be a noticeable effect. Another solution, the one employed by Firefox, is to read the localStorage data file as a page is being loaded. This ensures that later access to localStorage is as fast as possible and has predictable performance. The downside of that approach is that the read from file could adversely affect the loading time of the page.

As I'm writing this, there is still no solution to this particular problem. Some are calling for a completely new API to replace localStorage, while others are intent on fixing the existing API. Regardless of what happens, there is likely to be a lot more research done in the area of client-side data storage soon.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/localstorage-read-performance/。最初发布于 2011 年 12 月 2 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/localstorage-read-performance/. Originally published on Dec 02, 2011.

第 3 章 为什么内联所有内容并不是答案

Chapter 3. Why Inlining Everything Is NOT the Answer

盖伊 ·波贾尼

Guy Podjarny

我经常被问到最好的前端优化是否不是简单地内联所有内容。内联所有内容意味着将所有脚本、样式和图像嵌入到 HTML 中,并将它们作为一个大包提供。

Every so often I get asked if the best frontend optimization wouldn’t be to simply inline everything. Inlining everything means embedding all the scripts, styles, and images into the HTML, and serving them as one big package.

这个问题是过度采用最佳实践的一个很好的例子。是的,减少 HTTP 请求的数量是一个有价值的最佳实践。是的,内联所有内容是减少请求数量(理论上减少到一个)的最终方法。但不,这不是让您的网站更快的最佳方法。

This question is a great example of taking a best practice too far. Yes, reducing the number of HTTP requests is a valuable best practice. Yes, inlining everything is the ultimate way to reduce the number of requests (in theory to one). But NO, it’s not the best way to make your site faster.

虽然减少请求是一种很好的做法,但这并不是唯一重要的方面。如果您内联所有内容,您就实现了“减少请求”目标,但您会错过许多其他目标。以下是您不应内联所有内容的一些具体原因。

While reducing requests is a good practice, it’s not the only aspect that matters. If you inline everything, you fulfill the “Reduce Requests” goal, but you’re missing many others. Here are some of the specific reasons you shouldn’t inline everything.

无浏览器缓存

No Browser Caching

内联所有内容最明显的问题是缓存的丢失。如果HTML保存了所有资源,并且HTML本身不可缓存,则每次都会重新下载资源。这意味着新网站上的首页加载速度可能会更快,但后续页面或回访者的页面加载速度会较慢。

The most obvious problem with inlining everything is the loss of caching. If the HTML holds all the resources, and the HTML is not cacheable by itself, the resources are re-downloaded every time. This means the first page load on a new site may be faster, but subsequent pages or return visitors would experience a slower page load.

例如,我们看一下《纽约时报》首页的重复访问情况(表3-1图3-1)。得益于缓存,原始站点加载时间为 2.7 秒。如果我们内联该页面上的 JavaScript 文件,重复访问加载时间将攀升至 3.2 秒,并且大小会加倍。从视觉上看,由于 JavaScript 对渲染的影响,负面影响要大得多。

For example, let’s look at the repeat visit of the New York Times’ home page (Table 3-1, Figure 3-1). Thanks to caching, the original site loads in 2.7 seconds. If we inline the JavaScript files on that page, the repeat visit load time climbs to 3.2 seconds, and the size doubles. Visually, the negative impact is much greater, due to JavaScript’s impact on rendering.

表 3-1。www.nyt.com IE8;DSL;弗吉尼亚州杜勒斯

Table 3-1. www.nyt.com IE8; DSL; Dulles, VA

                             Repeat view load time    # requests    # bytes
Original Site                2.701 seconds            46            101 KB
Inlined External JS Files    3.159 seconds            36            212 KB


图 3-1。纽约时报网

Figure 3-1. www.nyt.com

即使 HTML 是可缓存的,缓存持续时间也必须是页面上所有资源的最短持续时间。如果您的 HTML 可缓存 10 分钟,并且页面中的资源可缓存一天,那么您实际上也将资源的可缓存性降低到 10 分钟。

Even if the HTML is cacheable, the cache duration has to be the shortest duration of all the resources on the page. If your HTML is cacheable for 10 minutes, and a resource in the page is cacheable for a day, you’re effectively reducing the cacheability of the resource to be 10 minutes as well.
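The rule in the paragraph above amounts to taking a minimum. A small sketch, with hypothetical names and the durations from the example expressed in seconds:

```javascript
// Effective cache lifetime of a page whose resources are inlined:
// the whole bundle can live no longer than its shortest-lived part.
// All names here are illustrative.
function effectiveMaxAge(htmlMaxAgeSeconds, inlinedMaxAgesSeconds) {
  return Math.min(htmlMaxAgeSeconds, ...inlinedMaxAgesSeconds);
}

const TEN_MINUTES = 10 * 60;
const ONE_DAY = 24 * 60 * 60;

// HTML cacheable for 10 minutes with a one-day resource inlined into it.
const bundled = effectiveMaxAge(TEN_MINUTES, [ONE_DAY]);
```

Here `bundled` comes out at 600 seconds, so the resource that could have stayed cached for a day is now re-downloaded with the HTML every 10 minutes.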

无边缘缓存

No Edge Caching

CDN 的传统价值称为边缘缓存:在 CDN 边缘缓存静态资源。缓存资源直接从边缘提供,因此比一路路由到源服务器获取资源要快得多。

The traditional value of CDNs is called Edge Caching: caching static resources on the CDN edge. Cached resources are served directly from the edge, and thus delivered much faster than routing all the way to the origin server to get them.

内联数据时,资源被捆绑到 HTML 中,从 CDN 的角度来看,整个事情只是一个 HTTP 响应。如果 HTML 不可缓存,则整个 HTTP 响应也不可缓存。因此,每次用户请求页面时,都需要从源获取 HTML 及其所有资源,而在标准情况下,许多资源可以从边缘缓存提供。

When inlining data, the resources are bundled into the HTML, and from the CDN’s perspective, the whole thing is just one HTTP response. If the HTML is not cacheable, this entire HTTP response isn’t cacheable either. Therefore, the HTML and all of its resources would need to be fetched from the origin every time a user requests the page, while in the standard case many of the resources could have been served from the Edge Cache.

因此,即使是首次访问您网站的访问者,从具有内联资源的页面获得的体验也可能比从具有链接资源的页面获得的体验慢。当客户端从远离服务器的位置进行浏览时尤其如此。

As a result, even first-time visitors to your site are likely to get a slower experience from a page with inlined resources than from a page with linked resources. This is especially true when the client is browsing from a location far from your server.

例如,让我们看一下从巴西使用 IE8 和电缆连接浏览 Apple 主页。(表 3-2图 3-2)将站点修改为内联图像将加载时间从约 2.4 秒增加到约 3.1 秒,这可能是因为内联图像数据必须从原始服务器而不是 CDN 获取。虽然请求数量减少了 30%,但页面速度实际上变慢了。

For example, let’s take a look at browsing the Apple home page from Brazil, using IE8 and a cable connection (Table 3-2, Figure 3-2). Modifying the site to inline images increased the load time from about 2.4s to about 3.1s, likely because the inlined image data had to be fetched from the origin servers and not the CDN. While the number of requests decreased by nearly 30%, the page was in fact slower.

表 3-2。www.apple.com IE8;电缆; 巴西圣保罗

Table 3-2. www.apple.com IE8; Cable; São Paulo, Brazil

                   First view load time    # requests    # bytes
Original Site      2.441 seconds           36            363 KB
Inlined Images     3.157 seconds           26            361 KB


图 3-2。苹果网站

Figure 3-2. www.apple.com

无需按需加载

No Loading On-Demand

按需加载资源是性能优化的一个重要类别,它尝试仅在实际需要时加载资源。资源可以被引用,但在条件需要之前不会被实际下载和评估。

Loading resources on-demand is an important category of performance optimizations, which attempt to only load a resource when it’s actually required. Resources may be referenced, but not actually downloaded and evaluated until the conditions require it.

浏览器为 CSS 图像提供内置的按需加载机制。如果 CSS 规则引用了背景图像,则仅当页面上至少有一个元素与该规则匹配时,浏览器才会下载该图像。另一个例子是按需加载图像(http://www.blaze.io/technical/the-impact-of-image-optimization/),它仅在页面图像滚动到视图中时下载它们。移动网页设计的渐进增强方法使用类似的概念,仅根据需要加载 JavaScript 和 CSS。

Browsers offer a built-in loading-on-demand mechanism for CSS images. If a CSS rule references a background image, the browser would only download it if at least one element on the page matched the rule. Another example is loading images on-demand (http://www.blaze.io/technical/the-impact-of-image-optimization/), which only downloads page images as they scroll into view. The Progressive Enhancement approach to Mobile Web Design uses similar concepts for loading JavaScript and CSS only as needed.
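To make the image case concrete, here is a minimal sketch of loading images as they scroll into view. It is an assumption about one way to do it, not the implementation behind the link above: `shouldLoadImage`, the 200px margin, and the `data-src` attribute convention are all hypothetical, and the browser wiring is skipped when there is no `window`.

```javascript
// Decide whether an image is close enough to the viewport to start loading.
// shouldLoadImage() and the 200px margin are assumptions for illustration.
function shouldLoadImage(imageTop, imageHeight, scrollTop, viewportHeight, margin = 200) {
  const viewTop = scrollTop - margin;
  const viewBottom = scrollTop + viewportHeight + margin;
  const imageBottom = imageTop + imageHeight;
  return imageBottom >= viewTop && imageTop <= viewBottom;
}

// Browser wiring (skipped outside a browser): images carry their real URL in
// a hypothetical data-src attribute and get src assigned once they are near.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const loadVisible = () => {
    document.querySelectorAll("img[data-src]").forEach((img) => {
      const top = img.getBoundingClientRect().top + window.scrollY;
      if (shouldLoadImage(top, img.offsetHeight, window.scrollY, window.innerHeight)) {
        img.src = img.dataset.src; // this assignment triggers the download
        img.removeAttribute("data-src");
      }
    });
  };
  window.addEventListener("scroll", loadVisible);
  loadVisible();
}
```

The decision is made on the client, using the scroll position and viewport size, which a server inlining images into the HTML does not have.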

由于内联资源是服务器上做出的决定,因此它不会从按需加载中受益。这意味着所有图像(CSS 或页面图像)都会被嵌入,无论特定客户端上下文是否需要它们。通常,通过内联获得的价值低于因不进行这些其他优化而损失的价值。

Since inlining resources is a decision made on the server, it doesn’t benefit from loading on-demand. This means all the images (CSS or page images) are embedded, whether they’re needed by the specific client context or not. More often than not, the value gained by inlining is lower than the value lost by not having these other optimizations.

作为一个例子,我以《太阳报》的主页为例,对其应用了两种相互冲突的优化(表3-3图3-3)。第一个按需加载图像,第二个内联所有图像。按需加载图像时,页面大小加起来约为 1MB,加载时间约为 9 秒。内联图像时,页面大小增加到近 2MB,加载时间增加到 16 秒。无论哪种方式,页面都会发出许多请求,但内联图像和按需图像之间的负载和大小差异非常明显。

As an example, I took The Sun’s home page and applied two conflicting optimizations to it (Table 3-3, Figure 3-3). The first loads images on demand, and the second inlines all images. When loading images on demand, the page size added up to about 1MB, and load time was around 9 seconds. When inlining images, the page size grew to almost 2MB, and the load time increased to 16 seconds. Either way the page makes many requests, but the load and size differences between inlining images and images on-demand are very noticeable.

表 3-3。www.thesun.co.uk IE8;DSL;弗吉尼亚州杜勒斯

Table 3-3. www.thesun.co.uk IE8; DSL; Dulles, VA

                            First view load time    # requests    # bytes
Loading Images On-Demand    9.038 seconds           194           1,028 KB
Inlined Images              16.190 seconds          228           1,979 KB


图 3-3。www.thesun.co.uk

Figure 3-3. www.thesun.co.uk

使浏览器前瞻无效

Invalidates Browser Look-Ahead

现代浏览器使用智能启发式方法来尝试提前预取页面底部的资源。例如,如果您的网站在 HTML 末尾 引用了http://www.3rdparty.com/code.js ,则浏览器可能会解析www.3rdparty.com的 DNS ,甚至可能开始下载该文件,早在它能够实际执行之前。

Modern browsers use smart heuristics to try and prefetch resources at the bottom of the page ahead of time. For instance, if your site references http://www.3rdparty.com/code.js towards the end of the HTML, the browser is likely to resolve the DNS for www.3rdparty.com, and probably even start downloading the file, long before it can actually execute it.

在标准网站中,HTML 本身很小,因此浏览器只需下载几十 KB 即可看到整个 HTML。一旦它看到(并解析)整个 HTML,它就可以开始预取它认为合适的内容。如果大量使用内联,HTML 本身会变得更大,大小可能超过 0.5MB。下载时,浏览器无法查看和加速页面下方的资源,其中许多是您无法内联的第三方工具。

In a standard website, the HTML itself is small, and so the browser only needs to download a few dozen KB before it sees the entire HTML. Once it sees (and parses) the entire HTML, it can start prefetching as it sees fit. If you’re making heavy use of inlining, the HTML itself becomes much bigger, possibly over 0.5MB in size. While downloading it, the browser can’t see and accelerate the resources further down the page—many of which are third-party tools you couldn’t inline.

有缺陷的解决方案:仅在第一次访问时内联所有内容

Flawed Solution: Inline Everything only on First Visit

缓存问题的部分解决方案的工作原理如下:

A partial solution to the caching problem works as follows:

  • 用户第一次访问您的网站时,内联所有内容并为用户设置 cookie

  • The first time a user visits your site, inline everything and set a cookie for the user.

  • 页面加载后,将所有资源下载为单独的文件。

  • Once the page loads, download all the resources as individual files.

  • 如果用户访问该页面并拥有 cookie,则假设它在缓存中拥有文件,并且不要内联数据。

  • If a user visits the page and has the cookie, assume it has the files in the cache, and don’t inline the data.
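The server-side decision in the steps above might be sketched as follows. This only illustrates the scheme as described, not a recommendation: the cookie name `assets_cached` and both function names are hypothetical.

```javascript
// Server-side decision for the scheme above, given the raw Cookie header.
// The cookie name "assets_cached" is hypothetical.
function hasCookie(cookieHeader, name) {
  return (cookieHeader || "")
    .split(";")
    .map((pair) => pair.trim().split("=")[0])
    .includes(name);
}

function shouldInline(cookieHeader) {
  // No cookie: treat as a first visit and inline everything.
  // Cookie present: assume (perhaps wrongly) that the files are still cached.
  return !hasCookie(cookieHeader, "assets_cached");
}

const firstVisit = shouldInline(undefined);
const returnVisit = shouldInline("session=abc; assets_cached=1");
```

Note that the only input is the cookie. Nothing here can check whether the browser still has the files in its cache.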

虽然总比没有好,但该解决方案的缺陷是它假设页面要么完全缓存,要么完全不缓存。事实上,网站和缓存状态非常不稳定。用户的缓存只能容纳不到一天的浏览数据:平均用户每天浏览 88 个页面 ( http://blog.newrelic.com/wp-content/uploads/infog_061611.png ),平均页面大小为 930KB ( http://httparchive.org/interesting.php#bytesperpage ),大多数桌面浏览器缓存的数据不超过 75MB ( http://www.blaze.io/mobile/understanding-mobile-cache-sizes/ )。对于移动设备来说,这个比例甚至更糟。

While better than nothing, the flaw in this solution is that it assumes a page is either entirely cached or entirely not cached. In reality, websites and cache states are extremely volatile. A user’s cache can only hold less than a day’s worth of browsing data: An average user browses 88 pages/day (http://blog.newrelic.com/wp-content/uploads/infog_061611.png), an average page weighs 930KB (http://httparchive.org/interesting.php#bytesperpage), and most desktop browsers cache no more than 75MB of data (http://www.blaze.io/mobile/understanding-mobile-cache-sizes/). For mobile, the ratio is even worse.
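The arithmetic behind "less than a day's worth" can be checked directly from the three figures quoted above. The sketch assumes the worst case, where every page view is a distinct page fetched in full with no resources shared between pages, so it is an upper bound on daily cache churn rather than a measurement.

```javascript
// Figures quoted in the text above.
const pagesPerDay = 88;
const averagePageKB = 930;
const cacheLimitMB = 75;

// Worst case: every page view is a distinct page fetched in full.
const dailyMB = (pagesPerDay * averagePageKB) / 1024;
const daysUntilFull = cacheLimitMB / dailyMB;
```

`dailyMB` works out to roughly 80 MB, which exceeds the 75 MB cache, so `daysUntilFull` is a little under one day.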

另一方面,Cookie 通常会在规定的到期日之前有效。因此,使用 cookie 来预测缓存状态很快就会变得毫无意义,然后您就会回到根本不内联的状态。

Cookies, on the other hand, usually live until their defined expiry date. Therefore, using a cookie to predict the cache state becomes pointless very quickly, and then you’re just back to not inlining at all.

该解决方案的最大问题之一是它的演示效果比实际效果更好。在综合测试中,如 WebPageTest 测试,页面确实要么完全缓存(即,其所有资源都被缓存),要么根本不缓存。因此,这些测试使首次访问内联方法看起来像是万能的,这是完全错误的。

One of the biggest problems with this solution is that it demos better than it really is. In synthetic testing, like WebPageTest tests, a page is indeed either fully cached (i.e., all its resources are cached), or it’s not cached at all. These tests therefore make the inline-on-first-visit approach look like the be-all and end-all, which is just plain wrong.

另一个重要问题是并非所有 CDN 都支持通过 cookie 改变缓存。因此,如果您的某些页面是可缓存的,或者您认为稍后可以将它们设置为可缓存的,则可能很难甚至不可能让 CDN 缓存页面的两个不同版本,并根据 cookie 选择要提供服务的版本。

Another significant problem is that not all CDNs support varying cache by a cookie. Therefore, if some of your pages are cacheable, or if you think you might make them cacheable later, it may be hard to impossible to get the CDN to cache two different versions of your page, and choose the one to serve based on a cookie.

总结和建议

Summary and Recommendations

我们的世界不是非黑即白的。减少请求数量是加速网站速度的好方法,但这并不意味着它是唯一的解决方案。如果你做得太过分,你最终会减慢你的网站速度,而不是加快速度。

Our world isn’t black and white. The fact that reducing the number of requests is a good way to accelerate your site doesn’t mean it’s the only solution. If you take it too far, you’ll end up slowing down your site, not speeding it up.

尽管有这些限制,内联仍然是前端优化领域中一个很好且重要的工具。因此,您应该使用它,但要小心不要滥用它。以下是有关何时使用内联的一些建议,但请记住,您应该验证它们是否在您自己的网站上获得了正确的效果:

Despite all these limitations, inlining is still a good and important tool in the world of frontend optimization. As such, you should use it, but be careful not to abuse it. Here are some recommendations about when to use inlining, but keep in mind you should verify that they get the right effect on your own site:

非常小的文件应该内联。
Very small files should be inlined.

请求和响应的 HTTP 开销通常约为 1KB,因此小于该值的文件绝对应该内联。我们的测试表明您几乎不应该内联文件大于 4KB。

The HTTP overhead of a request and response is often ~1KB, so files smaller than that should definitely be inlined. Our testing shows you should almost never inline files bigger than 4KB.
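
As a rough sketch, the two thresholds can be expressed as a small helper. The function name and the middle "measure" category are my own labels for illustration; only the ~1KB and 4KB figures come from the text.

```javascript
var HTTP_OVERHEAD_BYTES = 1024;   // "~1KB" request/response overhead, from the text
var MAX_INLINE_BYTES = 4 * 1024;  // "4KB" upper bound, from the text

// Returns 'inline', 'external', or 'measure' for a file of the given size.
function inliningAdvice(fileSizeBytes) {
  if (fileSizeBytes < HTTP_OVERHEAD_BYTES) return 'inline';
  if (fileSizeBytes > MAX_INLINE_BYTES) return 'external';
  return 'measure'; // in between: test on your own site
}
```

For example, `inliningAdvice(500)` suggests inlining, while `inliningAdvice(10000)` suggests keeping the file external.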

页面图像(即从页面引用的图像,而不是 CSS)很少应该内联。
Page images (i.e., images referenced from the page, not CSS) should rarely be inlined.

页面图像往往尺寸较大,在正常使用中不会阻塞其他资源,而且它们往往比 CSS 和脚本更频繁地更改。要优化图像文件加载,请改为按需加载图像 ( http://www.blaze.io/technical/the-impact-of-image-optimization/ )。

Page images tend to be large, they don’t block other resources in normal use, and they tend to change more frequently than CSS and scripts. To optimize image file loading, load images on demand instead (http://www.blaze.io/technical/the-impact-of-image-optimization/).

任何对于首屏页面视图不重要的内容都不应该内联。
Anything that isn’t critical for the above-the-fold page view should not be inlined.

相反,它应该推迟到页面加载之后,或者至少异步。

Instead, it should be deferred till after page load, or at least made async.
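
One way to defer a non-critical script until after page load is sketched below. The function name and the example path are hypothetical, and `win` and `doc` are passed in as parameters so the sketch can be exercised outside a browser. Very old versions of IE would need `attachEvent` instead of `addEventListener`, which is left out here.

```javascript
// On a real page: deferScript(window, document, '/js/extras.js');
function deferScript(win, doc, src) {
  function inject() {
    var js = doc.createElement('script');
    js.src = src;
    var first = doc.getElementsByTagName('script')[0];
    first.parentNode.insertBefore(js, first);
  }
  if (doc.readyState === 'complete') {
    inject(); // the page has already finished loading
  } else {
    win.addEventListener('load', inject, false);
  }
}
```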

小心内联 CSS 图像。
Be careful with inlining CSS images.

许多 CSS 文件在许多页面之间共享,其中每个页面仅使用三分之一或更少的规则。如果您的网站就是这种情况,那么如果您不内联这些图像,您的网站很有可能会更快。

Many CSS files are shared across many pages, where each page only uses a third or less of the rules. If that’s the case for your site, there’s a decent chance your site will be faster if you don’t inline those images.

不要仅依赖综合测量——使用 RUM(真实用户监控)。
Don’t rely only on synthetic measurements—use RUM (Real User Monitoring).

像 WebPageTest 这样的工具是无价的,但它们并不能显示一切。测量现实世界的性能并将该信息与综合测试结果一起使用。

Tools like WebPageTest are priceless, but they don’t show everything. Measure real world performance and use that information alongside your synthetic test results.

笔记

Note

要对本章发表评论,请访问 http://calendar.perfplanet.com/2011/why-inlining-everything-is-not-the-answer/。最初发布于 2011 年 12 月 3 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/why-inlining-everything-is-not-the-answer/. Originally published on Dec 03, 2011.

第 4 章 异步代码片段的艺术和工艺

Chapter 4. The Art and Craft of the Async Snippet

斯托扬 ·斯特凡诺夫

Stoyan Stefanov

JavaScript 下载会阻止其他页面组件的加载。这就是为什么以非阻塞异步方式加载脚本文件很重要(使其成为关键)的原因。如果这对您来说是新的,您可以从雅虎用户界面 (YUI) 库博客 ( http://www.yuiblog.com/blog/2008/07/22/non-blocking-scripts/ )上的这篇文章开始,或者性能日历文章 ( http://calendar.perfplanet.com/2010/the-truth-about-non-blocking-javascript/ )。

JavaScript downloads block the loading of other page components. That’s why it’s important (make that critical) to load script files in a nonblocking asynchronous fashion. If this is new to you, you can start with this post on the Yahoo User Interface (YUI) library blog (http://www.yuiblog.com/blog/2008/07/22/non-blocking-scripts/) or the Performance Calendar article (http://calendar.perfplanet.com/2010/the-truth-about-non-blocking-javascript/).

在这篇文章中,我将从第三方的角度研究该主题 - 当您是第三方时,为其他开发人员提供一个片段以包含在他们的页面上。无论是广告、插件、小部件、访问计数器、分析还是其他任何东西。

In this post, I’ll examine the topic from the perspective of a third party—when you’re the third party, providing a snippet for other developers to include on their pages. Be it an ad, a plug-in, widget, visits counter, analytics, or anything else.

让我们详细看看 Facebook 的 JavaScript SDK 是如何解决这个问题的。

Let’s see in much detail how this issue is addressed in Facebook’s JavaScript SDK.

Facebook 插件 JS SDK

The Facebook Plug-ins JS SDK

Facebook JavaScript SDK 是一段多用途代码,可让您集成 Facebook 服务、进行 API 调用以及加载社交插件,例如“赞”按钮 (https://developers.facebook.com/docs/reference/plugins/like/)。

The Facebook JavaScript SDK is a multipurpose piece of code that lets you integrate Facebook services, make API calls, and load social plug-ins such as the Like button (https://developers.facebook.com/docs/reference/plugins/like/).

当涉及“赞”按钮和其他社交插件时,SDK 的任务是解析页面的 HTML 代码,查找要替换为插件的元素(例如 <fb:like> 或 <div class="fb-like">)。插件本身是一个 iframe,它指向类似 facebook.com/plugins/like.php 的地址,带有适当的 URL 参数和适当的尺寸。

The task of the SDK when it comes to Like button and other social plug-ins is to parse the page’s HTML code looking for elements (such as <fb:like> or <div class="fb-like">) to replace with a plug-in. The plug-in itself is an iframe that points to something like facebook.com/plugins/like.php with the appropriate URL parameters and appropriately sized.

这是此类插件 URL 的示例:

This is an example of one such plug-in URL:

https://www.facebook.com/plugins/like.php?href=bookofspeed.com&layout=box_count

JavaScript SDK 的 URL 如下所示:

The JavaScript SDK has a URL like so:

http://connect.facebook.net/en_US/all.js

问题是如何将此代码包含在您的页面上。传统上,这是最简单的(但阻塞)方式:

The question is how do you include this code on your page. Traditionally it has been the simplest possible (but blocking) way:

<script src="http://connect.facebook.net/en_US/all.js"></script>

不过,从社交插件诞生的第一天起,就一直可以异步加载该脚本,并且保证可以正常工作。此外,几个月前,当各种向导类型配置器生成 SDK 片段代码时,异步片段成为默认值。

Since day one of the social plug-ins though, it has always been possible to load this script asynchronously and it was guaranteed to work. Additionally, a few months ago the async snippet became the default when SDK snippet code is being generated by the various wizard-type configurators.

图 4-1显示了示例配置器的外观。

Figure 4-1 shows what an example configurator looks like.

图 4-1。喜欢按钮配置器

Figure 4-1. Like button configurator

异步代码看起来比传统代码更复杂(更长),但对于主机页面的整体加载速度来说,这是非常值得的。

The async code looks more complicated (it’s longer) than the traditional one, but it’s well worth it for the overall loading speed of the host page.

在检查此代码片段之前,让我们看看设计第三方提供商代码片段时的一些目标是什么。

Before we inspect this snippet, let’s see what some of the goals were when designing a third-party provider snippet.

设计目标

Design Goals

  • 该片段应该很小。不一定以字节数来衡量,但总的来说,它不应该看起来令人生畏。

  • The snippet should be small. Not necessarily measured in number of bytes, but overall it shouldn’t look intimidating.

  • 虽然很小,但应该可读。所以不允许缩小。

  • Even though it’s small, it should be readable. So no minifying allowed.

  • 它应该在“敌对”环境中工作。您无法控制主机页面。它可能是有效的 XHTML-strict 页面,可能缺少文档类型,甚至可能缺少(或具有多个)<body>、<head>、<html> 或任何其他标记。

  • It should work in “hostile” environments. You have no control over the host page. It may be a valid XHTML-strict page, it may be missing a doctype, it may even be missing (or have more than one) <body>, <head>, <html>, or any other tag.

  • 该片段应该易于复制粘贴。除了规模小之外,这还意味着它应该可以正常工作,因为使用此代码的人甚至可能不是开发人员。或者,如果他们是开发人员,他们可能不一定有时间阅读文档。这也意味着有些人会在同一页面上多次粘贴该代码片段,尽管 JS 每个页面只需要加载一次。

  • The snippet should be copy-paste-friendly. In addition to being small that means it should just work, because people using this code may not even be developers. Or, if they are developers, they may not necessarily have the time to read documentation. That also means that some people will paste that snippet of code many times on the same page, even though the JS needs to be loaded only once per page.

  • 它应该对主机页面不显眼,这意味着它不应该留下任何全局变量和其他残留物,当然,除了包含的 JavaScript 之外。

  • It should be unobtrusive to the host page, meaning it should leave no globals and other leftovers, other than, of course, the included JavaScript.

片段

The Snippet

Facebook 插件配置器中的代码片段如下所示:

The snippet in the Facebook plug-in configurators looks like so:

<script>(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return;
  js = d.createElement(s); js.id = id;
  js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));</script>

看看这里发生了什么。

Take a look at what’s going on here.

在第一行和最后一行,您会看到整个代码段被包装在一个立即(又名,自调用,又名自执行)函数中。这是为了确保任何临时变量保留在本地范围内,并且不会渗透到主机页面的全局命名空间中。

On the first and last lines you see that the whole snippet is wrapped in an immediate (a.k.a. self-invoking, a.k.a. self-executing) function. This ensures that any temporary variables remain in the local scope and don’t bleed into the host page’s global namespace.

在第 1 行,您还可以看到立即函数接受三个参数,这些参数在调用函数时在最后一行提供。这些参数是document对象和两个字符串的简写,所有这些参数稍后都会在函数中多次使用。将它们作为参数传递比在函数体中定义它们要短一些。它还节省了一行(垂直空间),因为另一个选项类似于:

On line 1, you can also see that the immediate function accepts three arguments, and these are supplied on the last line when the function is invoked. These arguments are shorthand for the document object and two strings, all of which are used more than once later in the function. Passing them as arguments is somewhat shorter than defining them in the body of the function. It also saves a line (vertical space), because the other option is something like:

<script>(function() {
  var d = document, s = 'script', id = 'facebook-jssdk',
      js, fjs = d.getElementsByTagName(s)[0];
  // the rest...
}());</script>

这将多一行(记住我们需要可读的片段,而不是过长的行)。此外,第一行和最后一行将有“未使用”的空间,因为它们有点短。

This would be one line longer (remember, we want a readable snippet, not overly long lines). Also, the first and last lines would have “unused” space, as they are somewhat short.

将重复出现的 document 赋值给较短的 d,可以让整个片段更短,而且可能会稍快一些,因为 d 是局部变量,其查找速度比全局的 document 更快。

Assigning the repeated document to a shorter d makes the whole snippet shorter and probably marginally faster, because d is a local variable and is looked up faster than the global document.

接下来我们有:

Next we have:

var js, fjs = d.getElementsByTagName(s)[0];

此行声明一个变量并查找页面上第一个可用的 <script> 元素。我稍后会讲到这一点。

This line declares a variable and finds the first available <script> element on the page. I’ll get to that in a second.

第 3 行检查脚本是否已经存在于页面上,如果是,则提前退出,因为没有其他事情可做:

Line 3 checks whether the script is already on the page and, if so, exits early, as there’s nothing more to do:

if (d.getElementById(id)) return;

我们只需要该文件一次。当人们在同一页面上多次复制并粘贴此代码时,此行可防止脚本文件被多次包含。对于常规的阻止脚本标记来说,这尤其糟糕,因为最终结果类似于(假设页面的博客文章类型):

We only need the file once. This line prevents the script file from being included several times when people copy and paste this code multiple times on the same page. Duplicate includes are especially bad with a regular blocking script tag, because the end result is something like this (assuming a blog post type of page):

<script src="...all.js"></script>
<fb:like /> <!-- one like button at the top of the blog post -->

<script src="...all.js"></script>
<fb:like/> <!-- second like button at the end of the post -->

<script src="...all.js"></script>
<fb:comments/> <!-- comments plugin after the article -->

<script src="...all.js"></script>
<fb:recommendations/> <!-- sidebar with recommendations plugin -->

这会导致重复的 JavaScript,这很糟糕 ( http://developer.yahoo.com/performance/rules.html#js_dupes ),因为某些浏览器最终可能会多次下载该文件。

This results in duplicate JavaScript, which is all kinds of bad (http://developer.yahoo.com/performance/rules.html#js_dupes), because some browsers may end up downloading the file several times.

即使 JavaScript 是异步的,即使浏览器足够聪明,不会重新解析它,它仍然需要重新执行它,在这种情况下,脚本会覆盖自身,一次又一次地重新定义其函数和对象。非常不受欢迎。

Even if the JavaScript is asynchronous and even if the browser is smart enough not to reparse it, it will still need to re-execute it, in which case the script overwrites itself, redefining its functions and objects again and again. Highly undesirable.

因此,给脚本设置一个像 'facebook-jssdk' 这样不太可能与主机页面上的内容发生冲突的 id,可以让我们检查该文件是否已被包含。如果没有,我们继续前进。

So giving the script an id like 'facebook-jssdk', which is unlikely to clash with anything on the host page, lets us check whether the file has already been included. If it hasn’t, we move on.
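
The guard is easier to see when the snippet body is pulled out into a named function. This is only an illustrative rewrite: the name `loadOnce`, the `src` parameter, and the boolean return value are my additions and are not part of the Facebook snippet.

```javascript
// Same steps as the snippet, but reporting whether a script was inserted.
function loadOnce(d, s, id, src) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return false; // already requested: nothing to do
  js = d.createElement(s); js.id = id;
  js.src = src;
  fjs.parentNode.insertBefore(js, fjs);
  return true;
}

// In a browser, pasting the call twice inserts the script only once:
// loadOnce(document, 'script', 'facebook-jssdk', '//connect.facebook.net/en_US/all.js#xfbml=1');
// loadOnce(document, 'script', 'facebook-jssdk', '//connect.facebook.net/en_US/all.js#xfbml=1');
```

The first call returns true and inserts the element. The second finds the element by its id and returns false without touching the DOM.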

下一行创建一个script 元素并分配 ID,以便我们稍后检查:

The next line creates a script element and assigns the ID so we can check for it later:

js = d.createElement(s); js.id = id;

以下行设置脚本的源:

The following line sets the source of the script:

js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";

请注意,URL 中省略了协议。这意味着将使用主机页面的协议加载脚本。如果主机页面使用 http://,脚本加载速度会更快;如果页面使用 https://,将不会出现混合内容安全提示。

Note that the protocol of the URL is missing. This means that the script will be loaded using the host page’s protocol. If the host page uses http://, the script will load faster, and if the page uses https:// there will be no mixed content security prompts.
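
To see how a protocol-relative URL resolves, you can use the standard `URL` constructor available in modern browsers and Node.js. It is used here purely as an illustration; the snippet itself relies on the browser doing this resolution implicitly, and the example.org page addresses are placeholders.

```javascript
var SDK_SRC = '//connect.facebook.net/en_US/all.js#xfbml=1';

// Resolves the protocol-relative SDK URL against a given host page URL.
function resolveAgainst(pageUrl) {
  return new URL(SDK_SRC, pageUrl).href;
}

console.log(resolveAgainst('http://example.org/post'));
console.log(resolveAgainst('https://example.org/post'));
```

The first call yields an http:// address and the second an https:// one, matching the scheme of the page that hosts the snippet.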

最后,我们将新创建的js元素附加到主页的 DOM 中,就完成了:

Finally, we append the newly created js element to the DOM of the host page and we’re done:

fjs.parentNode.insertBefore(js, fjs);

这是如何运作的?嗯,fjs 是页面上可用的第一个 (f) JavaScript (js) 元素。我们早些时候在第 2 行获取了它。我们在 fjs 之前插入新的 js 元素。假设主机页面在 body 之后紧跟着一个 script 元素,那么:

How does that work? Well, fjs is the first (f) JavaScript (js) element available on the page. We grabbed it earlier on line #2. We insert our new js element right before the fjs. If, let’s say, the host page has a script element right after the body, then:

  • fjs是脚本。

  • fjs is the script.

  • fjs.parentNode 是 body。

  • fjs.parentNode is the body.

  • 新脚本插入到 body 和旧 script 之间。

  • The new script is inserted between the body and the old script.

追加替代方案

Appending Alternatives

为什么要费劲使用 parentNode.insertBefore?有更简单的方法可以将节点添加到 DOM 树,例如使用 appendChild() 附加到 <head> 或 <body>,但是这是保证在几乎所有情况下都有效的方法。让我们看看其他方法失败的原因。

Why go to the trouble of the whole parentNode.insertBefore? There are simpler ways to add a node to the DOM tree, such as appending to the <head> or to the <body> by using appendChild(). However, this is the way that is guaranteed to work in nearly all cases. Let’s see why the others fail.

这是一个常见的模式:

Here is a common pattern:

document.getElementsByTagName('head')[0].appendChild(js);

或者使用 document.head 的变体(在较新的浏览器中可用):

Or a variation that uses document.head where it is available (in newer browsers):

(document.head || document.getElementsByTagName('head')[0]).appendChild(js);

问题是您无法控制主页的标记。如果页面没有元素怎么办head?浏览器无论如何都会创建该节点吗?事实证明,大多数情况下,是的,但有些浏览器(Opera 8、Android 1)不会创建头部。Steve Souders 的 BrowserScope 测试证明了这一点 ( http://stevesouders.com/tests/autohead.html )。

The problem is that you don’t control the markup of the host page. What if the page doesn’t have a head element? Will the browser create that node anyway? It turns out that most of the time it will, but there are browsers (Opera 8, Android 1) that won’t create the head. A BrowserScope test by Steve Souders demonstrates this (http://stevesouders.com/tests/autohead.html).

那么 body 呢?页面总得有 body。所以你应该能够这样做:

What about the body? You gotta have the body. So you should be able to do:

document.body.appendChild(js);

我创建了一个 BrowserScope 测试 (http://www.phpied.com/files/bscope/autobody.html),但找不到不会创建 document.body 的浏览器。但是当异步代码片段的 script 元素是嵌套的、而不是 body 的直接子元素时,IE7 中仍然会出现可爱的“操作中止”(Operation Aborted) 错误。

I created a browserscope test (http://www.phpied.com/files/bscope/autobody.html) and couldn’t find a browser that will not create document.body. But there’s still the lovely “Operation Aborted” error which occurs in IE7 when the async snippet script element is nested and not a direct child of the body.

最后的机会:

Last chance:

document.documentElement.firstChild.appendChild(js);

document.documentElement 是 HTML 元素,它的第一个子元素必须是 head。事实证明不一定。如果 HTML 元素后面有注释,WebKit 会将注释作为第一个子元素提供给您。有一项附带测试用例的调查展示了这一点 (http://robert.accettura.com/blog/2009/12/12/adventures-with-document-documentelement-firstchild/)。

document.documentElement is the HTML element, and its first child must be the head. Not necessarily, as it turns out. If there’s a comment following the HTML element, WebKit will give you the comment as the first child. There’s an investigation with a test case that shows this (http://robert.accettura.com/blog/2009/12/12/adventures-with-document-documentelement-firstchild/).

哇!

Whew!

尽管有可能的替代方案,但使用第一个可用的 script 节点和 insertBefore 似乎是最具弹性的选项。总是至少有一个 script 节点,即使那是片段本身的 script 节点。

Despite the possible alternatives, it appears that using the first available script node and insertBefore is the most resilient option. There’s always going to be at least one script node, even if that’s the script node of the snippet itself.

(嗯,“总是”在 Web 开发中是一个很重的词。正如 @kangax (http://twitter.com/kangax) 曾经指出的那样,您可以将代码片段放在 <body onload="..."> 中,瞧,神奇!一个没有 script 节点的脚本。)

(Well, “always” is a strong word in web development. As @kangax (http://twitter.com/kangax) pointed out once, you can have the snippet inside a <body onload="..."> and voila—magic!—a script without a script node.)

少了什么东西?

What’s Missing?

您可能会注意到此代码片段中缺少一些您可能在其他代码示例中看到的内容。

You may notice some things missing in this snippet that you may have seen in other code examples.

例如,没有:

For instance there are none of:

js.async = true;
js.type = "text/javascript";
js.language = "JavaScript";

这些都是默认值,不需要占用空间,所以就省略了。例外是某些早期 Firefox 版本中的 async,但脚本已经足够非阻塞和异步了。

These are all defaults that don’t need to take up space, so they were omitted. The exception is async in some earlier Firefox versions, but the script is already nonblocking and asynchronous enough anyway.

<script> 标签本身也是如此。它是一个 HTML5 有效的基本标签,没有 type 或 language 属性。

Same goes for the <script> tag itself. It’s an HTML5-valid bare-bones tag with no type or language attributes.

第一方

First Parties

整个讨论是从第三方脚本提供商的角度出发的。如果您控制标记,有些事情可能会有所不同并且更容易。您可以安全地引用头部,因为您知道它就在那里。您不必检查重复插入,因为您只需插入一次。所以你最终可能会得到更简单的东西,例如:

This whole discussion was from the perspective of a third-party script provider. If you control the markup, some things might be different and easier. You can safely refer to the head because you know it’s there. You don’t have to check for duplicate insertions, because you’re only going to insert it once. So you may end up with something much simpler, such as:

<script>(function(d) {
  var js = d.createElement('script');
  js.src = "http://example.org/my.js";
  (d.head || d.getElementsByTagName('head')[0]).appendChild(js);
}(document));</script>

这就是您控制主页时所需要的一切。

This is all it takes when you control the host page.

此外,我们一直假设只要脚本到达,它就会运行。但您可能有不同的需求,例如在脚本准备好后调用特定函数。在这种情况下,您需要监听 js.onload 和 js.onreadystatechange(例如:http://www.phpied.com/javascript-include-ready-onload/)。在更复杂的示例中,您可能需要加载多个脚本并保证它们的执行顺序。此时,您可能需要查看任何可用的脚本加载器项目,例如 LAB.js (http://labjs.com/) 或 head.js (http://headjs.com/),它们是专门为解决这些情况而设计的。

Also, we have assumed throughout that whenever the script arrives, it just runs. But you may have different needs, for example calling a specific function once the script is ready. In that case you need to listen to js.onload and js.onreadystatechange (example: http://www.phpied.com/javascript-include-ready-onload/). In even more complex cases, you may want to load several scripts and guarantee their order of execution. At that point you may want to look into one of the available script loader projects, such as LAB.js (http://labjs.com/) or head.js (http://headjs.com/), which are specially designed to solve these cases.
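
A minimal sketch of listening to both events might look like the following. The function name `loadScript` and its signature are my own choices for illustration; the linked article has a fuller treatment. The document is passed in as `d`, in the same spirit as the snippet.

```javascript
// Loads src asynchronously and calls callback once, when the script is ready.
function loadScript(d, src, callback) {
  var js = d.createElement('script'), done = false;
  js.onload = js.onreadystatechange = function () {
    var state = js.readyState; // only set by older IE versions
    if (done || (state && state !== 'loaded' && state !== 'complete')) return;
    done = true;
    js.onload = js.onreadystatechange = null; // avoid firing twice
    callback();
  };
  js.src = src;
  var fjs = d.getElementsByTagName('script')[0];
  fjs.parentNode.insertBefore(js, fjs);
  return js;
}

// In a browser:
// loadScript(document, 'http://example.org/my.js', function () { /* ready */ });
```

The `done` flag and the nulled handlers guard against browsers that fire both events for the same script.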

临别赠言:站在巨人的肩膀上

Parting Words: On the Shoulders of Giants

有点令人不安的是,我们 Web 开发人员需要竭尽全力来确保异步脚本执行(无论是否在第三方环境中)。有一天,当我们身后有一些死浏览器时,我们将能够简单地说script async=true,它会正常工作。同时,我希望这篇文章能够作为尚未解决此问题的人们的资源减轻一些痛苦,并有望为他们节省一些时间。

It’s a little disturbing that we, the web developers, need to go to all these lengths to ensure asynchronous script execution (in a third-party environment or not). One day, with a few dead browsers behind us, we’ll be able to simply say script async=true and it will just work. Meanwhile, I hope this post will ease some of the pain and save some time for people who have yet to run into this problem.

Google AdSense 人员在与社区分享他们的进展时经历了很多尝试和错误,Mathias Bynens 还为他们的代码片段写了一篇鼓舞人心的评论 ( http://mathiasbynens.be/notes/async-analytics-snippet )。Steve Souders ( http://stevesouders.com/ ) 对此主题进行了研究并撰写了文章,MSN.com 可能是最早使用这种加载 JavaScript 技术的网站之一。雅虎和许多其他公司都有关于该主题的文章。这些是帮助寻找“完美”片段的一些巨头。谢谢你!

Google AdSense folks have gone through a lot of trial and error while sharing their progress with the community, and Mathias Bynens also wrote an inspirational critique (http://mathiasbynens.be/notes/async-analytics-snippet) of their snippet. Steve Souders (http://stevesouders.com/) has done research and written about this topic, and MSN.com was probably among the first to use such a technique for loading JavaScript. There are write-ups from Yahoo and many others on the topic. These are some of the giants that have helped in the search for the “perfect” snippet. Thank you!

(嘘,如果您在代码片段中看到一些不完美的地方,请大声说出来!)

(Psst, and if you see something that is less than perfect in the snippet, please speak up!)

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/the-art-and-craft-of-the-async-snippet/。最初发布于 2011 年 12 月 4 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/the-art-and-craft-of-the-async-snippet/. Originally published on Dec 04, 2011.

第 5 章 运营商网络:掉进兔子洞

Chapter 5. Carrier Networks: Down the Rabbit Hole

蒂姆 ·卡德莱克

Tim Kadlec

在刘易斯·卡罗尔的《爱丽丝梦游仙境》中,爱丽丝相信她在跟随兔子掉进洞里后可能永远无法离开自己所处的房间。她开始质疑自己的决定:

There’s a point in Lewis Carroll’s Alice's Adventures in Wonderland where Alice believes she may never be able to leave the room she has found herself in after following the rabbit down its hole. She starts to question her decision:

我几乎希望我没有掉进那个兔子洞——然而——然而——你知道,这种生活相当奇怪。

I almost wish I hadn’t gone down that rabbit hole—and yet—and yet—it’s rather curious, you know, this kind of life.

移动性能的世界也有同样的感觉——尤其是当您开始探索移动运营商网络时。如果您正在寻找一致性和稳定性,您应该寻找其他地方。另一方面,如果您喜欢在不稳定环境的混乱中找到的能量和兴奋,那么您会发现自己就像在家里一样。

The world of mobile performance can feel the same—particularly when you start to explore mobile carrier networks. If you’re looking for consistency and stability, you should look elsewhere. If, on the other hand, you enjoy the energy and excitement found in the chaos that surrounds an unstable environment, then you’ll find yourself right at home.

变化性

Variability

系统的复杂度可能由其变量的数量决定,而运营商网络的变量也非常多。它们的性能会根据位置、使用网络的人数、天气、运营商等因素而发生巨大变化,没有什么可以保证您能够保持静态。

The complexity of a system may be determined by the number of its variables, and carrier networks have a lot of variables. Their performance varies dramatically depending on factors such as location, the number of people using a network, the weather, the carrier—there isn’t much that you can rely on to remain static.

一项研究 ( http://www.pcworld.com/article/167391/a_day_in_the_life_of_3g.html ) 证明了不同地点之间存在多大差异。该测试涉及检查美国各个城市的三个不同移动运营商(Sprint、Verizon 和 AT&T)的 3G 网络带宽。结果的多样性令人震惊。

One study (http://www.pcworld.com/article/167391/a_day_in_the_life_of_3g.html) demonstrated just how much variance there can be from location to location. The test involved checking bandwidth on 3G networks for three different mobile carriers (Sprint, Verizon, and AT&T) in various cities across the United States. The diversity of the results was stunning.

新奥尔良 Verizon 网络的最高记录带宽为 1425 kbps。纽约市 AT&T 的速度最低为 477 kbps,相差 948 kbps。即使在单个运营商内,这种差异也是显着的。虽然 Verizon 的最高带宽为 1425 kbps,但在俄勒冈州波特兰,其记录的最低带宽为 622 kbps。

The highest recorded bandwidth was 1425 kbps in New Orleans on a Verizon network. The lowest was 477 kbps in New York City on AT&T, a difference of 948 kbps. Even within a single carrier, the variation was remarkable. While Verizon topped out at 1425 kbps, its lowest recorded bandwidth was 622 kbps in Portland, Oregon.

另一个非正式实验(http://www.webperformancetoday.com/2011/10/26/interesting-findings-3g-mobile-performance-is-up-to-10x-slower-than-throttled-broadband-service/)是最近由约书亚·比克斯比 (Joshua Bixby) 指挥。Joshua 随机记录了他的 3G 网络的带宽和延迟量。即使在他家这个单一位置,延迟也从 100 多毫秒一直到 350 毫秒不等。

Another informal experiment (http://www.webperformancetoday.com/2011/10/26/interesting-findings-3g-mobile-performance-is-up-to-10x-slower-than-throttled-broadband-service/) was recently conducted by Joshua Bixby. Joshua randomly recorded the amounts of bandwidth and latency on his 3G network. Even within a single location, his house, the latency varied from just over 100 ms all the way up to 350 ms.

延迟

Latency

已发布的有关移动网络延迟的信息非常少。2010 年,雅虎!根据他们所做的一项小型研究( http://www.yuiblog.com/blog/2010/04/08/analyzing-bandwidth-and-latency/ )发布了一些信息。进入 YUI 博客的流量受到带宽和延迟的监控。这些数字按连接类型进行平均,并将结果以图表形式发布。他们的研究表明,移动连接的平均延迟为 430 毫秒,而有线连接的平均延迟仅为 130 毫秒。

Remarkably little information about mobile network latency has been published. In 2010, Yahoo! released some information based on a small study (http://www.yuiblog.com/blog/2010/04/08/analyzing-bandwidth-and-latency/) they had done. Traffic coming into the YUI blog was monitored for both bandwidth and latency. These numbers were averaged by connection type and the results published as a graph. Their study showed that the average latency for a mobile connection was 430 ms, compared to only 130 ms for an average cable connection.

这项研究并非万无一失。样本量很小,访问 YUI 博客的受众类型并不完全代表普通人。至少是公开发布的数据。迄今为止发布的其余大部分延迟数据都没有太多背景信息;通常没有提及它是如何测量的。

The study isn’t foolproof. The sample size was small and the type of audience that would be visiting the YUI blog is not exactly a representation of the average person. At least it was publicly released data. Most of the rest of the latency numbers released so far come without much context; there usually isn’t any mention of how it was measured.

转码

Transcoding

移动网络的另一个问题是运营商转码引起的频繁问题。例如,许多网络尝试减小图像的文件大小。有时,这是在不被注意的情况下完成的。然而,结果通常是图像变得颗粒状或模糊,并且网站的外观受到负面影响。

Another concern with mobile networks is the frequent issues caused by carrier transcoding. Many networks, for example, attempt to reduce the file size of images. Sometimes, this is done without being noticed. Often, however, the result is that images become grainy or blurry and the appearance of the site is affected in a negative way.

英国《金融时报》致力于通过使用 dataURI 来避免其移动 Web 应用程序出现此问题 (http://www.tomhume.org/2011/10/appftcom-and-the-cost-of-cross-platform-web-apps.html),但即使这种技术也不是完全安全的。虽然该问题尚未得到充分记录或孤立,但英国的一些开发人员报告说,英国最大的移动提供商之一 O2 有时会删除 dataURI。

The Financial Times worked to avoid this issue with their mobile web app by using dataURIs instead (http://www.tomhume.org/2011/10/appftcom-and-the-cost-of-cross-platform-web-apps.html), but even this technique is not entirely safe. While the issue is not well documented or isolated yet, a few developers in the UK have reported that O2, one of the largest mobile providers in the UK, will sometimes strip out dataURIs.

转码并不止于图像。T-Mobile 最近被发现会删除任何看起来像 JavaScript 注释的内容 (http://www.mysociety.org/2011/08/11/mobile-operators-breaking-content/)。其意图大体上是好的,但方法却会带来问题。例如,jQuery 库中有一个包含 */* 的字符串。在库的后面,您可以再次找到相同的字符串。看到这两个字符串后,T-Mobile 会删除它们之间的所有内容,从而在此过程中破坏许多网站。

Transcoding doesn’t stop at images. T-Mobile was recently found to be stripping out anything that looked like a JavaScript comment (http://www.mysociety.org/2011/08/11/mobile-operators-breaking-content/). The intentions were mostly honorable, but the method leads to issues. The jQuery library, for example, has a string that contains */*. Later on in the library, you can again find the same string. Seeing these two strings, T-Mobile would then strip out everything that was in between, breaking many sites in the process.
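
To illustrate how this kind of breakage can happen, here is a deliberately naive comment stripper. It is a guess at the general class of bug, not T-Mobile’s actual code, and the sample source line is invented for the demonstration.

```javascript
// Naive on purpose: the pattern knows nothing about string literals.
function naiveStripComments(source) {
  return source.replace(/\/\*[\s\S]*?\*\//g, '');
}

// Two string literals containing */*, with real code in between.
var original = 'var a = "*/*"; doWork(); var b = "*/*";';
var transcoded = naiveStripComments(original);

console.log(transcoded);
```

The "/*" inside the first string is treated as the start of a comment and the "*/" inside the second as its end, so the doWork() call between them silently disappears.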

这种转码方法也可能会给任何试图通过首先注释掉 JavaScript 来延迟加载的人带来问题 (http://googlecode.blogspot.com/2009/09/gmail-for-mobile-html5-series-reducing.html),这是一种流行且有效的技术,用于改进解析和页面加载时间。

This method of transcoding could also create issues for anyone who is trying to lazy-load their JavaScript by first commenting it out (http://googlecode.blogspot.com/2009/09/gmail-for-mobile-html5-series-reducing.html), a popular and effective technique for improving parse and page load time.
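
For readers unfamiliar with the technique, the idea can be sketched roughly as follows. This is a simplified illustration rather than the code from the linked post: the module source and the `activate` function are hypothetical, and a real page would read the commented text from a script element instead of a string variable.

```javascript
// Hypothetical module, shipped inside a comment so that it is downloaded
// with the page but not parsed or executed until it is needed.
var commentedModule =
  '/* return { greet: function (name) { return "Hello, " + name; } }; */';

// Strips the comment delimiters and evaluates the code on demand.
function activate(commentedSource) {
  var code = commentedSource
    .replace(/^\s*\/\*/, '')   // drop the opening delimiter
    .replace(/\*\/\s*$/, '');  // drop the closing delimiter
  return new Function(code)();
}

var lazyModule = activate(commentedModule);
console.log(lazyModule.greet('Alice'));
```

If a transcoder removes the comment on the way to the device, there is nothing left to activate.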

运营商 Optus 不仅通过降低图像分辨率导致图像模糊,而且还会以阻塞的方式向页面注入外部脚本 (http://www.zdnet.com.au/optus-3g-accelerator-spawns-blurry-pics-339303393.htm)。不幸的是,大多数这些转码问题和技术都没有被充分公开或记录良好。我怀疑还有无数其他问题正在等待被发现。

One carrier, Optus, not only causes blurry images by lowering the image resolution, but also injects an external script into the page in a blocking manner (http://www.zdnet.com.au/optus-3g-accelerator-spawns-blurry-pics-339303393.htm). Unfortunately, most of these transcoding issues and techniques are not widely publicized or well documented. I suspect countless others are just waiting to be discovered.

那里的山里有黄金

Gold in Them There Hills

这听起来有点令人沮丧,但这不是这里的目标。我们需要进一步探索运营商网络,因为如果我们愿意深入挖掘,我们可以挖掘出令人难以置信的丰富信息。

This can all sound a bit discouraging, but that’s not the goal here. We need to explore carrier networks further because there is an incredible wealth of information we can unearth if we’re willing to dig far enough.

其中一个例子是 Steve Souders 最近测试的不活动计时器和状态机的想法 ( http://www.stevesouders.com/blog/2011/09/21/making-a-mobile-connection/ )。移动网络依靠不同的状态来确定分配的吞吐量,这反过来又会影响电池消耗。为了在状态之间进行向下切换(从而减少电池消耗,同时也减少吞吐量),运营商会发送一个不活动计时器。不活动计时器向设备发出信号,指示其应切换到更节能的状态。这可能会对性能产生很大影响,因为可能需要一两秒钟才能恢复到最高状态。正如您可能怀疑的那样,这个不活动计时器因运营商而异。史蒂夫已经设置了一个测试(http://stevesouders.com/ms/)您可以运行它来尝试确定不活动计时器可能在当前连接上触发的位置。结果虽然并非万无一失,但确实强烈表明这些计时器可能存在显着差异。

One example of this is the idea of inactivity timers and state machines that Steve Souders was recently testing (http://www.stevesouders.com/blog/2011/09/21/making-a-mobile-connection/). Mobile networks rely on different states to determine allotted throughput, which in turn affects battery drain. To down-switch between states (thereby reducing battery drain, but also throughput) the carrier sends an inactivity timer. The inactivity timer signals to the device that it should shift to a more energy-efficient state. This can have a large impact on performance because it can take a second or two to ramp back up to the highest state. This inactivity timer, as you might suspect, varies from carrier to carrier. Steve has set up a test (http://stevesouders.com/ms/) that you can run in an attempt to identify where the inactivity timer might fire on your current connection. The results, while not foolproof, do strongly suggest that these timers can be dramatically different.
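
A toy model can make the effect of the inactivity timer easier to reason about. Everything here is a simplification I am assuming for illustration: real networks have more than two states, and the timeout and ramp-up values below are placeholders, not measurements from any carrier.

```javascript
// Models a radio that drops to a low-power state after a period of inactivity.
function makeRadio(inactivityTimeoutMs, rampUpMs) {
  var lastActivity = null; // time of the previous request, in ms
  return {
    // Returns the extra delay paid by a request issued at time `now`.
    request: function (now) {
      var idle = lastActivity === null ||
                 now - lastActivity >= inactivityTimeoutMs;
      lastActivity = now;
      return idle ? rampUpMs : 0;
    }
  };
}

// Placeholder numbers: 5-second inactivity timer, 2-second ramp-up.
var radio = makeRadio(5000, 2000);
console.log(radio.request(0));    // first request pays the ramp-up
console.log(radio.request(1000)); // radio still in the high state
console.log(radio.request(7000)); // timer has fired, so ramp-up is paid again
```

In this model, requests spaced just beyond the timer each pay the full ramp-up cost, while requests grouped closely together pay it only once.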

我们需要更多此类信息和测试。网络最初并不是针对数据进行优化的;它们针对语音进行了优化。当 3G 网络推出时,人们期望数据流量的主要来源将来自图片消息等。唯一可访问的移动互联网是 WAP——一种非常简化的网络版本。

We need more of this kind of information and testing. Networks weren’t originally optimized for data; they were optimized for voice. When 3G networks were rolled out, the expectation was that the major source of data traffic would come from things like picture messaging. The only accessible mobile Internet was WAP—a very simplified version of the Web.

然而,随着设备的功能变得越来越强大,在这些设备上体验完整的互联网变得可能。人们开始期待看到的不仅仅是有限版本的互联网,而是整个互联网(视频、猫图片等等),导致网络不堪重负。

As devices became more and more capable, however, it became possible to experience the full Internet on these devices. People started expecting to see not just a limited version of the Internet, but the whole thing (videos, cat pictures, and all), leaving the networks overwhelmed.

毫无疑问,运营商正在采用与这些转码方法和状态机类似的其他技术来绕过其网络的限制,以便为更多客户提供更快的数据服务。

There are undoubtedly other techniques, similar to these transcoding methods and state machines, that carriers are using to get around the limitations of their networks in order to provide faster data services to more customers.

4G 救不了我们

4G Won’t Save Us

许多人喜欢指出即将推出的 4G 网络可以缓解这些担忧。在某种程度上,他们是对的——它确实有助于解决一些延迟和带宽问题。然而,对于运营商来说,进行这种转变是一项相当昂贵的努力,这意味着我们不应该指望在一夜之间广泛推出。

Many people like to point to the upcoming roll-out of 4G networks as a way of alleviating many of these concerns. To some extent, they’re right: it will indeed help with some of the latency and bandwidth issues. However, it’s a pretty costly endeavor for carriers to make that switch, meaning that we shouldn’t expect widespread roll-out overnight.

即使进行了切换,我们也可以预期运营商使用的质量、覆盖范围和优化方法也不会统一。威廉·吉布森说:“未来已经到来,只是分布不均。” 移动连接也有非常相似的情况。

Even when the switch has been made we can expect that the quality, coverage and methods of optimization used by the carriers will not be uniform. William Gibson said, “The future is already here—it’s just not evenly distributed.” Something very similar could be said of mobile connectivity.

我们该何去何从?

Where Do We Go from Here?

为了推动这一讨论,我们需要做一些事情。首先,改善开发人员、制造商和运营商之间的沟通将大有帮助。如果不是 AT&T 的研究论文 (http://www.research.att.com/articles/featured_stories/2011_03/201102_Energy_efficient),我们可能仍然不知道运营商状态机和不活动计时器对性能的影响。更多此类信息不仅能让我们了解优化移动性能时的独特考虑因素,还能为我们提供一些视角。我们被提醒,这不仅仅与加载时间有关;还有其他因素在起作用,我们需要考虑权衡。

To move this discussion forward, we need a few things. For starters, some improved communication between developers, manufacturers, and carriers would go a long, long way. If not for AT&T’s research paper (http://www.research.att.com/articles/featured_stories/2011_03/201102_Energy_efficient), we might still be unaware of the performance impact of carrier state machines and inactivity timers. More information like this not only clues us in to the unique considerations of optimizing for mobile performance, but also gives us a bit of perspective. We are reminded that it’s not just about load time; there are other factors at play and we need to consider the trade-offs.

改进的通信也可以大大减少由转码方法引起的问题。以 T-Mobile 错误删除评论为例。如果在实施此方法之前与开发人员进行某种公开对话,那么问题可能会在该功能上线之前就被发现。

Improved communication could also go a long way toward reducing the issues caused by transcoding methods. Take the case of T-Mobile’s erroneous comment stripping. Had there been some sort of open dialogue with developers before implementing this method, the issues would probably have been caught well before the feature made it live.

我们还可以使用更多工具。移动性能测试工具的数量和质量正在不断提高。然而,我们仍然可以使用很少的工具来测试真实设备和真实网络上的性能。随着导航计时 API的采用,这将有助于改善这种情况。然而,仍然有足够的空间来创建更强大的测试工具。

We could also use a few more tools. The number—and quality—of mobile performance testing tools is improving. Yet we still have precious few tools at our disposal for testing performance on real devices, over real networks. As the Navigation Timing API gains adoption, that will help to improve the situation. However, there will still be ample room for the creation of more robust testing tools as well.

光在隧道的尽头

Light at the End of the Tunnel

你知道,最终爱丽丝走出了那个小房间。她继续经历了许多冒险并遇到了许多有趣的生物。醒来后,她想,这是一个多么美妙的梦啊。随着我们的工具不断改进,我们进一步探索这个兔子洞,有一天我们也将能够理解这一切。当我们做我们的应用程序时,我们的网站将会变得更好。

You know, eventually Alice gets out of that little room. She goes on to have many adventures and meet many interesting creatures. After she wakes up, she thinks what a wonderful dream it had been. As our tools continue to improve and we explore this rabbit hole further, one day we, too, will be able to make some sense of all of this. When we do, our applications and our sites will be better for it.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/carrier-networks-down-the-rabbit-hole/。最初发布于 2011 年 12 月 5 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/carrier-networks-down-the-rabbit-hole/. Originally published on Dec 05, 2011.

第 6 章 HTTP 中并行性的需求

Chapter 6. The Need for Parallelism in HTTP

布赖恩 ·潘恩

Brian Pane

简介:从楼梯上摔下来

Introduction: Falling Down the Stairs

图 6-1中的图像是瀑布图的一部分,显示了 IE8 浏览器为下载电子商务网站主页上的图形而执行的 HTTP 请求。

The image in Figure 6-1 is part of a waterfall diagram showing the HTTP requests that an IE8 browser performed to download the graphics on the home page of an e-commerce website.

笔记

Note

网站名称和 URL 均经过模糊处理,以隐藏网站的身份。当我们稍后会看到,许多其他网站都遇到同样的问题时,点名指出一个网站作为性能不佳的例子是不公平的。

The site name and URLs are blurred to conceal the site’s identity. It would be unfair to single out one site by name as an example of poor performance when, as we’ll see later, so many others suffer the same problem.


图 6-1。阶梯瀑布图案

Figure 6-1. Stair-step waterfall pattern

此瀑布示例中看到的阶梯图案显示了几个值得注意的事情:

The stair-step pattern seen in this waterfall sample shows several noteworthy things:

  • 客户端对每个服务器主机名使用六个并发的持久连接,这是现代桌面浏览器中的典型 ( http://www.browserscope.org/?category=network ) 配置。

  • The client used six concurrent, persistent connections per server hostname, a typical (http://www.browserscope.org/?category=network) configuration among modern desktop browsers.

  • 在每个连接上,浏览器连续发出 HTTP 请求:它在发送下一个请求之前等待每个请求的响应。

  • On each of these connections, the browser issued HTTP requests serially: it waited for a response to each request before sending the next request.

  • 该序列中的所有请求都是相互独立的;图像 URL 是在瀑布中之前加载的 CSS 文件中指定的。因此,重要的是,客户端并行下载所有这些图像是有效的

  • All the requests in this sequence were independent of each other; the image URLs were specified in a CSS file loaded earlier in the waterfall. Thus, significantly, it would be valid for a client to download all these images in parallel.

  • 客户端和服务器之间的往返时间 (RTT) 约为 125 毫秒。因此,许多对小对象的请求只花费了 1 个多一点的 RTT。浏览器下载页面上所有 N 个小图像所花费的时间非常接近 (N * RTT / 6),这表明下载时间很大程度上是 HTTP 请求数量的函数(除以 6,这要归功于浏览器使用多个连接)。

  • The round-trip time (RTT) between the client and server was approximately 125ms. Thus many of these requests for small objects took just over 1 RTT. The elapsed time the browser spent downloading all N of the small images on the page was very close to (N * RTT / 6), demonstrating that the download time was largely a function of the number of HTTP requests (divided by six, thanks to the browser’s use of multiple connections).

  • 响应数据量非常小:在瀑布的这一部分中,大约 1 秒内总共有 25KB,平均吞吐量低于 0.25 Mb/s。本次测试运行的客户端有几Mb/s的下行网络带宽,因此请求的序列化导致可用带宽的利用效率低下

  • The amount of response data was quite small: a total of 25KB in about 1 second during this part of the waterfall, for an average throughput of under 0.25 Mb/s. The client in this test run had several Mb/s of downstream network bandwidth, so the serialization of requests resulted in inefficient utilization of the available bandwidth.
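The arithmetic in the bullets above can be checked with a short sketch. The function names and the count of 48 images are illustrative assumptions, not figures taken from the waterfall; only the 125ms RTT, the six connections, and the 25KB-in-one-second payload come from the text.

```javascript
// Estimate the elapsed time for N small, independent requests that are issued
// serially on each of a fixed number of connections: about N * RTT / connections.
function estimateSerializedDownloadMs(requestCount, rttMs, connections = 6) {
  return (requestCount * rttMs) / connections;
}

// Average throughput in megabits per second for a payload and a duration.
function throughputMbps(kilobytes, seconds) {
  const bits = kilobytes * 1000 * 8;
  return bits / seconds / 1e6;
}

// 48 images is a hypothetical count chosen only for illustration.
const elapsedMs = estimateSerializedDownloadMs(48, 125);

// 25KB delivered in about 1 second, as described in the last bullet.
const mbps = throughputMbps(25, 1);

console.log(`estimated elapsed time: ${elapsedMs} ms`);
console.log(`average throughput: ${mbps} Mb/s`);
```

Because the estimate depends only on the request count, the RTT, and the number of connections, the size of each image barely matters as long as it fits in roughly one round trip.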

当前最佳实践:围绕 HTTP 进行工作

Current Best Practices: Working around HTTP

有几种成熟的技术可以避免这种阶梯模式及其 (N * RTT / 6) 运行时间。除了使用 CDN 来降低 RTT 和客户端缓存来降低 N 的有效值外,网站开发人员还可以应用多种内容优化

There are several well-established techniques for avoiding this stair-step pattern and its (N * RTT / 6) elapsed time. Besides using CDNs to reduce the RTT and client-side caching to reduce the effective value of N, the website developer can apply several content optimizations:

  • 精灵图像。

  • Sprite the images.

  • 将图像内联为数据:样式表中的 URI。

  • Inline the images as data: URIs in a stylesheet.

  • 如果某些图像恰好是渐变或圆角,请使用 CSS3 功能来完全消除对这些图像的需要。

  • If some of the images happen to be gradients or rounded corners, use CSS3 features to eliminate the need for those images altogether.

  • 应用域分片将 (N * RTT / 6) 的分母增加一个小的常数因子。

  • Apply domain sharding to increase the denominator of (N * RTT / 6) by a small constant factor.

尽管这些内容优化众所周知,但图 6-1中的瀑布示例表明它们并不总是得到应用。根据作者的经验,即使是注重性能的组织有时也会推出速度较慢的网站,因为速度只是争夺有限开发时间的众多优先事项之一。​

Although these content optimizations are well known, examples like the waterfall in Figure 6-1 show that they are not always applied. In the author’s experience, even performance-conscious organizations sometimes launch slow websites, because speed is just one of many priorities competing for limited development time.​

因此,一个有趣的问题是:一般网站在多大程度上避免了阶梯式 HTTP 请求序列化模式?

Thus an interesting question is: how well has the average website avoided the stair-step HTTP request serialization pattern?

实验:挖掘 HTTP 档案

Experiment: Mining the HTTP Archive

HTTP Archive ( http://httparchive.org/ ) 是一个包含 HTTP 请求详细记录的数据库,其中包括真实浏览器从 Alexa 下载全球数万个网站主页时发出的分辨率为 1ms 的计时数据热门网站列表。

The HTTP Archive (http://httparchive.org/) is a database containing detailed records of the HTTP requests (including timing data with 1ms resolution) that a real browser issued when downloading the home pages of tens of thousands of websites from the Alexa worldwide top sites list.

通过这个数据集,我们可以在每个网页中找到序列化的请求序列。第一步是从 HTTP Archive下载每个页面的 HAR ( http://www.softwareishard.com/blog/har-12-spec/ ) 文件。该文件包含页面的 HTTP 请求列表,我们可以根据简单的启发式定义找到请求的序列化序列:​

With this data set, we can find serialized sequences of requests in each web page. The first step is to download each page’s HAR (http://www.softwareishard.com/blog/har-12-spec/) file from the HTTP Archive. This file contains a list of the HTTP requests for the page, and we can find serialized sequences of requests based on a simple, heuristic definition:​

  • 序列化序列中的所有 HTTP 请求都必须是同一​scheme:host:port 的 GET。

  • All the HTTP requests in the serialized sequence must be GETs for the same scheme:host:port.

  • 除第一个之外的每个 HTTP 事务都必须在序列中的其他事务完成后立即开始(在可用计时数据的 1 毫秒分辨率内)。

  • Each HTTP transaction except the first must begin immediately upon the completion of some other transaction in the sequence (within the 1ms resolution of the available timing data).

  • 除最后一个事务外,每个事务的 HTTP 响应状态都必须为 2xx。

  • Each transaction except the last must have an HTTP response status of 2xx.

  • 除最后一个事务外,每个事务的响应内容类型都必须为 image/png、image/gif 或 image/jpeg。

  • Each transaction except the last must have a response content-type of image/png, image/gif, or image/jpeg.

此定义捕获了一组按顺序运行的 HTTP 请求的概念,因为浏览器缺乏并行运行它们的方法,而不是因为所请求的资源之间的内容相互依赖性。该定义出于谨慎考虑而犯了错误,排除了非图像请求,因为 JavaScript、CSS 或 SWF 文件可能是后续任何请求的先决条件。在接下来的讨论中,我们假设浏览器在序列开始时就知道序列化序列中所有图像的 URL,这有点过于乐观。​

This definition captures the concept of a set of HTTP requests that are run sequentially because the browser lacks a way to run them in parallel, rather than because of content interdependencies among the requested resources. The definition errs on the side of caution by excluding non-image requests, on the grounds that a JavaScript, CSS, or SWF file might be a prerequisite for any request that follows. In the discussion that follows, we err slightly on the side of optimism by assuming that the browser knew the URLs of all the images in a serialized sequence at the beginning of the sequence.​
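The heuristic above can be sketched as a small function over the `entries` array of a HAR file. This is a simplified reading of the definition, not the author's actual analysis code: it treats a sequence as a chain in which each request starts within the tolerance of the end of the previous one, and it takes the end time to be `startedDateTime` plus `time`. The function name and the default tolerance parameter are assumptions made for illustration.

```javascript
const IMAGE_TYPES = new Set(['image/png', 'image/gif', 'image/jpeg']);

// Returns the length of the longest serialized chain of requests in a list of
// HAR entries, following the heuristic definition (simplified to chains).
function longestSerializedSequence(entries, toleranceMs = 1) {
  const txns = entries
    .filter((e) => e.request.method === 'GET')
    .map((e) => {
      const start = Date.parse(e.startedDateTime);
      const mimeType = (e.response.content.mimeType || '').split(';')[0].trim();
      return {
        origin: new URL(e.request.url).origin, // scheme, host, and port
        start,
        end: start + e.time,
        // Only a 2xx image response may be followed by another request.
        canPrecede:
          e.response.status >= 200 &&
          e.response.status < 300 &&
          IMAGE_TYPES.has(mimeType),
        length: 1,
      };
    })
    .sort((a, b) => a.start - b.start);

  let longest = 0;
  for (let i = 0; i < txns.length; i++) {
    const cur = txns[i];
    for (let j = 0; j < i; j++) {
      const prev = txns[j];
      if (
        prev.canPrecede &&
        prev.origin === cur.origin &&
        Math.abs(cur.start - prev.end) <= toleranceMs
      ) {
        cur.length = Math.max(cur.length, prev.length + 1);
      }
    }
    longest = Math.max(longest, cur.length);
  }
  return longest;
}
```

Running it over the `log.entries` array of each page's HAR file and bucketing the results would give a per-page distribution of longest sequences. The quadratic loop is acceptable here because a single page rarely has more than a few hundred requests.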

结果:序列化比比皆是

Results: Serialization Abounds

图 6-2中的直方图显示了 HTTP Archive 2011 年 12 月 1 日数据集中的 49,854 个网页中每页最长序列化请求序列的分布。

The histogram in Figure 6-2 shows the distribution of the longest serialized request sequences per page among 49,854 web pages from the HTTP Archive’s December 1, 2011 data set.


图 6-2。每页最长序列化请求序列的分布

Figure 6-2. Distribution of the longest serialized request sequences per page

在本次调查中大约3%的网页中,没有请求序列化(即最长序列化请求长度为1)。从请求并行化的角度来看,这些页面已经得到了很好的优化。

In approximately 3% of the web pages in this survey, there is no serialization of requests (i.e., the longest serialized request length is one). From a request parallelization perspective, these pages already are quite well optimized.

在接下来的 30% 的网页中,最长的序列化请求序列的长度为二或三。这些页面可能会从增加的请求并行化中适度受益,并且像域分片这样的简单方法就足够了。

In the next 30% of the web pages, the longest serialized request sequence has a length of two or three. These pages might benefit modestly from increased request parallelization, and a simple approach like domain sharding would suffice.

其余三分之二的网页具有长度为 4 或更大的序列化请求序列。虽然内容优化可以改善这些页面的请求并行化,但如此多的网站具有如此多的序列化这一事实表明,内容优化的障碍并不小。​

The remaining two thirds of the web pages have serialized request sequences of length 4 or greater. While content optimizations could improve the request parallelization of these pages, the fact that so many sites have so much serialization suggests that the barriers to content optimization are nontrivial. ​

建议:是时候修复协议了

Recommendations: Time to Fix the Protocols

在不优化内容的情况下加速网站的一种方法是更广泛地实施 HTTP 请求管道。HTTP/1.1 自 RFC 2068 起就支持管道传输,但由于担心损坏的代理会错误处理管道化请求,大多数桌面浏览器尚未实现该功能。此外,队列头阻塞也是一个不小的问题。最近的工作重点是服务器向客户端提供有关哪些资源可以安全通过管道传输的提示( http://tools.ietf.org/html/draft-nottingham-http-pipeline-01 )。然而,移动浏览器开始更普遍地使用管道技术。

One way to speed up websites without content optimization would be through more widespread implementation of HTTP request pipelining. HTTP/1.1 has supported pipelining since RFC 2068, but most desktop browsers have not implemented the feature due to concerns about broken proxies that mishandle pipelined requests. In addition, head-of-queue blocking is a nontrivial problem; recent efforts have focused on ways for the server to give the clients hints (http://tools.ietf.org/html/draft-nottingham-http-pipeline-01) about what resources are safe to pipeline. Mobile browsers, however, are beginning to use pipelining more commonly.

另一种方法是在 HTTP 下引入多路复用会话层,以便客户端可以并行发出请求。此策略的一个示例是 SPDY (http://www.chromium.org/spdy),目前在 Chrome 中受支持,很快 (http://bitsup.blogspot.com/2011/11/video-of-spdy-talk-at-codebitseu.html) 也将在 Firefox 中受支持。

Another approach is to introduce a multiplexing session layer beneath HTTP, so that the client can issue requests in parallel. An example of this strategy is SPDY (http://www.chromium.org/spdy), supported currently in Chrome and soon (http://bitsup.blogspot.com/2011/11/video-of-spdy-talk-at-codebitseu.html) in Firefox.

无论是通过管道还是多路复用,业界似乎都值得寻求协议级解决方案来提高 HTTP 请求并行度。

Whether through pipelining or multiplexing, it appears worthwhile for the industry to pursue protocol-level solutions to increase HTTP request parallelization.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/the-need-for-parallelism-in-http/。最初发布于 2011 年 12 月 6 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/the-need-for-parallelism-in-http/. Originally published on Dec 06, 2011.

第 7 章自动化网站性能

Chapter 7. Automating Website Performance

乔什 ·弗雷泽

Josh Fraser

我相信自动化是网络性能优化的下一阶段。有很多优化手动实施很繁琐,或者可以通过自动化方式更好地完成。当然,这正是我们在 Torbit ( http://torbit.com/ ) 所做的事情 — 采用所有最佳实践,让每个人都能受益,而无需担心技术细节。

I believe that automation is the next phase for web performance optimization. There are a lot of optimizations that are tedious to implement by hand or can simply be done better in an automated fashion. Of course, this is exactly what we’re doing at Torbit (http://torbit.com/) — taking all the best practices and making the benefits accessible to everyone without you having to worry about the technical details.

在这里,我介绍了自动化的一些挑战,以及我们通过我们的服务优化数百个站点所学到的一些经验教训。我解释了为什么照着 YSlow (http://developer.yahoo.com/performance/rules.html) 或 Page Speed (http://code.google.com/speed/page-speed/docs/rules_intro.html) 的优化列表逐条尝试自动化,而不考虑更广泛的影响,是危险的。

Here, I present some of the challenges of automation and some of the lessons we have learned from optimizing hundreds of sites with our service. I explain why it is dangerous to go down the list of YSlow (http://developer.yahoo.com/performance/rules.html) or Page Speed (http://code.google.com/speed/page-speed/docs/rules_intro.html) optimizations and attempt to automate them without thinking through the broader implications.

在 Torbit 的早期,我们构建了一个过滤器来缩小和组合 CSS 文件。很简单,对吧?可能会出现什么问题?令我们惊讶的是,这个“安全”过滤器破坏了数量惊人的网站。经过调查,我们发现许多网站的 CSS 无效或损坏,而网站所有者却没有注意到。要了解这是如何发生的,您需要考虑浏览器如何处理 CSS 错误。大多数浏览器一旦遇到语法错误就会停止解析 CSS 文件。当你盲目地组合 CSS 时,那些曾经位于文件底部(因此无关紧要)的错误现在位于一个大文件的中间。本来可能是一个不会影响任何事情的小问题,现在可能会破坏网站的整个布局。

In the early days of Torbit, we built a filter that minified and combined CSS files. Pretty simple, right? What could possibly go wrong? To our surprise, this “safe” filter broke a surprising number of sites. After investigating, we discovered that many sites have invalid or broken CSS that had gone unnoticed by the site owners. To understand how this happens, you need to consider how browsers handle CSS errors. Most browsers will stop parsing a CSS file as soon as they run into a syntax error. When you blindly combine CSS, those errors that used to be at the bottom of a file (and therefore didn’t matter) are now in the middle of one big file. What may have been a small issue that didn’t affect anything could now be breaking the entire layout of the site.

显而易见的解决方案是修复或删除有问题的 CSS 规则,而这正是我们所做的。我们首先“修复”了损坏的 CSS 文件,然后将它们组合起来。不幸的是,修复他们的 CSS 产生了意想不到的后果。我们没有考虑到开发人员一直在破解他们损坏的 CSS 的事实。事实上,在某些情况下,这些错误已经融入到他们的网站中,以至于删除它们通常会完全破坏网站的视觉外观。当修复某人的代码完全破坏了他们的网站时,你应该做什么?

The obvious solution was to fix or remove the offending CSS rule, and that was exactly what we did. We “fixed” their broken CSS files first and then combined them. Unfortunately, fixing their CSS had unintended consequences. We hadn’t considered the fact that developers had been hacking around their broken CSS. In fact, in some cases these bugs had become so baked into their websites that removing them completely destroyed the visual look of the site. What are you supposed to do when fixing someone’s code totally breaks their site?

最终,我们构建了一个智能 CSS 加载器,它允许我们在一个请求中下载网页的所有 CSS 文件,同时仍然将每个文件单独应用到 DOM。此方法不仅解决了 CSS 损坏的问题,还具有其他优点,例如非阻塞和尽可能利用 HTML5 localStorage。

Ultimately, we built a Smart CSS Loader, which allows us to download all of the CSS files for a web page in one request, while still applying each of the files to the DOM individually. This method not only solves the issues from broken CSS, but includes other benefits like being nonblocking and taking advantage of HTML5 localStorage whenever possible.

这里的教训是遵循原则,但不一定遵循具体规则。在 CSS 示例中,基本原则是减少 HTTP 请求,无论您是手动还是以自动方式进行优化,这个目标都是成立的。组合 CSS 文件的具体规则显然需要重新考虑,以便能够将该优化应用于任何网站而不破坏任何内容。

The lesson here is to follow the principles, but not necessarily the specific rules. In the CSS example, the underlying principle was to reduce HTTP requests, and this goal holds true whether you are doing the optimizations by hand or in an automated fashion. The specific rule of combining CSS files obviously needed some rethinking in order to be able to apply that optimization to any site without breaking anything.

回到基础的好处之一是,它可以让你开阔思路,找到其他性能优化,如果你只关注 YSlow 或 Page Speed 规则,你可能会错过这些优化。YSlow 或 Page Speed 均未提及 Torbit 的一些最佳优化。例如,将图像转换为 WebP 格式(http://torbit.com/blog/2011/04/05/torbit-adds-support-for-webp/)并将其提供给目标浏览器是一个很好的优化,可以显着减少有效负载,但它不在列表中。使用 localStorage 减少 HTTP 请求并改进缓存(http://torbit.com/blog/2011/05/31/localstorage-mobile-performance/)也没有提及。公平地说,这些工具主要面向开发人员,对于大多数企业来说,手动实施此类优化没有意义。事实上,这些优化手动完成既不简单也不有趣,这使得它们成为自动化的完美候选者。

One of the benefits of going back to the fundamentals is that it opens your mind to find other performance optimizations you would have missed if you had simply focused on the YSlow or Page Speed rules. Some of the best optimizations we have at Torbit aren’t mentioned by either YSlow or Page Speed. For example, converting images to WebP format (http://torbit.com/blog/2011/04/05/torbit-adds-support-for-webp/) and serving them to targeted browsers is a great optimization that can significantly reduce payload, but it isn’t on the list. Using localStorage to cut down on HTTP requests and improve caching (http://torbit.com/blog/2011/05/31/localstorage-mobile-performance/) is also not mentioned. To be fair, those tools are primarily for developers, and optimizations like these don’t make sense for most businesses to implement by hand. The fact that these optimizations are neither easy nor fun to do by hand is what makes them such perfect candidates for automation.

如果您想实现自动化,那么重点关注基础知识非常重要。记住原则。让事物变得更小,将它们移得更近,缓存它们更长的时间,并更智能地加载它们。专注于最终目标,不要太拘泥于规则。

If you want to automate, it’s important to focus on the basics. Remember the principles. Make things smaller, move them closer, cache them longer, and load them more intelligently. Focus on the end objective and don’t get too caught up in the rules.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/automating-website-performance/。最初发布于 2011 年 12 月 7 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/automating-website-performance/. Originally published on Dec 07, 2011.

第 8 章北京的前端单点故障

Chapter 8. Frontend SPOF in Beijing

史蒂夫 ·苏德斯

Steve Souders

当我为性能日历撰写这篇文章时,我正在北京的 Velocity China。由于这是我第二次来北京,我对防火墙后面的挑战做好了更好的准备。我知道我无法访问 Google、Facebook 和 Twitter 等流行的美国网站,但当我进行典型的冲浪时,我惊讶地发现有多少其他网站似乎被屏蔽了。

I’m at Velocity China in Beijing as I write this article for the Performance Calendar. Since this is my second time in Beijing, I was better prepared for the challenges of being behind the Great Firewall. I knew I couldn’t access popular U.S. websites like Google, Facebook, and Twitter, but as I did my typical surfing I was surprised at how many other websites seemed to be blocked.

商业内幕

Business Insider

没过多久我就意识到问题出在前端 SPOF(http://www.stevesouders.com/blog/2010/06/01/frontend-spof/),即前端资源(脚本、样式表或字体文件)导致页面不可用。有些页面完全空白,例如 Business Insider(http://www.businessinsider.com,图 8-1)。

It didn’t take me long to realize the problem was frontend SPOF (http://www.stevesouders.com/blog/2010/06/01/frontend-spof/)—when a frontend resource (script, stylesheet, or font file) causes a page to be unusable. Some pages were completely blank, such as Business Insider (http://www.businessinsider.com, Figure 8-1).

Firebug 的网络面板显示 anywhere.js 下载需要很长时间,因为它来自被防火墙阻止的 platform.twitter.com。我们知道脚本会阻止所有后续 DOM 元素的渲染,因此假设 anywhere.js 是在 HEAD 中以阻塞模式加载的。查看 HTML 源代码,我们发现情况正是如此:

Firebug’s Net Panel shows that anywhere.js is taking a long time to download because it’s coming from platform.twitter.com – which is blocked by the firewall. Knowing that scripts block rendering of all subsequent DOM elements, we form the hypothesis that anywhere.js is being loaded in blocking mode in the HEAD. Looking at the HTML source, we see that’s exactly what is happening:

<head>
...
<!-- Twitter Anywhere -->
<script src="https://platform.twitter.com/anywhere.js?id=ZV0...&v=1"
        type="text/javascript"></script>
<!-- / Twitter Anywhere -->
...

</head>

<body>

图 8-1。由于 Twitter 脚本被阻止而导致可怕的“空白屏幕”

Figure 8-1. The dreaded “blank white screen” due to a blocking Twitter script

如果 anywhere.js 是异步加载的(http://www.stevesouders.com/blog/2009/04/27/loading-scripts-without-blocking/),就不会发生这种情况。相反,由于 anywhere.js 是用旧方式 <SCRIPT SRC=... 加载的,它会阻止后面的所有 DOM 元素,在本例中是页面的整个 BODY。如果我们等待足够长的时间,对 anywhere.js 的请求就会超时,页面就会开始渲染。请求多久会超时?查看 Business Insider 的“之后”屏幕截图,我们发现请求超时需要 1 分 15 秒。用户要盯着空白屏幕等待 Twitter 脚本整整 1 分 15 秒!(见图 8-2。)

If anywhere.js had been loaded asynchronously (http://www.stevesouders.com/blog/2009/04/27/loading-scripts-without-blocking/), this wouldn’t happen. Instead, since anywhere.js is loaded the old way with <SCRIPT SRC=..., it blocks all the DOM elements that follow, which in this case is the entire BODY of the page. If we wait long enough, the request for anywhere.js times out and the page begins to render. How long does it take for the request to time out? Looking at the “after” screenshot of Business Insider, we see it takes 1 minute and 15 seconds for the request to time out. That’s 1 minute and 15 seconds that the user is left staring at a blank white screen waiting for the Twitter script! (See Figure 8-2.)


图 8-2。Business Insider 最终在 1 分 15 秒后呈现

Figure 8-2. Business Insider finally renders after 1 minute 15 seconds

科技网

CNET

CNET ( http://www.cnet.com/ ) 的体验略有不同;显示导航标题,但页面的其余部分被阻止呈现(图 8-3)。

CNET (http://www.cnet.com/) has a slightly different experience; the navigation header is displayed but the rest of the page is blocked from rendering (Figure 8-3).

查看 Firebug,我们发现来自 cdn.eyewonder.com 的 wrapper.js 处于“待处理”状态,这一定是被防火墙阻止的另一个域。根据渲染停止的位置,我们猜测 wrapper.js 的 SCRIPT 标签紧接在导航标题之后,并以阻塞模式加载,从而阻止页面的其余部分渲染。HTML 证实情况确实如此:

Looking in Firebug we see that wrapper.js from cdn.eyewonder.com is “pending”—this must be another domain that’s blocked by the firewall. Based on where the rendering stops, our guess is that the wrapper.js SCRIPT tag is immediately after the navigation header and is loaded in blocking mode thus preventing the rest of the page from rendering. The HTML confirms that this is indeed what’s happening:

<header>
...
</header>

<script src="http://cdn.eyewonder.com/100125/771933/1592365/wrapper.js"></script>

<div id="rb_wrap">

<div id="rb_content"> <div id="contentMain">

图 8-3。CNET 渲染被 eyewonder.com 的广告屏蔽

Figure 8-3. CNET rendering is blocked by ads from eyewonder.com

奥莱利 雷达

O’Reilly Radar

每天,我都会访问 O'Reilly Radar 阅读 Nat Torkington 的 ( http://radar.oreilly.com/nat/index.html ) 四个短链接。通常,Nat 是 Radar 头版上众多故事之一,但从北京前往那里,显示的页面只有一个故事(图 8-4)。

Every day, I visit O’Reilly Radar to read Nat Torkington’s (http://radar.oreilly.com/nat/index.html) Four Short Links. Normally Nat’s is one of many stories on the Radar front page, but going there from Beijing shows a page with only one story (Figure 8-4).

在第一个故事的底部应该有一个推文按钮。此按钮由从 platform.twitter.com 获取的 widgets.js 脚本添加,而该域被防火墙阻止。如果 widgets.js 是异步获取的,这不会成为问题,但遗憾的是,查看 HTML 会发现情况并非如此:

At the bottom of this first story there’s supposed to be a Tweet button. This button is added by the widgets.js script fetched from platform.twitter.com which is blocked by the Great Firewall. This wouldn’t be an issue if widgets.js was fetched asynchronously, but sadly a peek at the HTML shows that’s not the case:

<a href="...">Comment</a>
&nbsp;|&nbsp;
<span class="social-counters">
<span class="retweet">
<a href="http://twitter.com/share" class="twitter-share-button"
   data-count="horizontal"
   data-url="http://radar.oreilly.com/2011/12/four-short-links-6-december-20-1.html"
   data-text="Four short links: 6 December 2011" data-via="radar"
   data-related="oreillymedia:oreilly.com">Tweet</a>
<script src="http://platform.twitter.com/widgets.js"
   type="text/javascript"></script>
</span>

图 8-4。O'Reilly 雷达渲染被 Twitter 小部件阻止。

Figure 8-4. O’Reilly Radar rendering is blocked by Twitter widget.

前端单点故障的原因

The Cause of Frontend SPOF

从这些示例中可能得出的一个结论是,前端 SPOF 特定于 Twitter 和 Eyewonder 以及其他一些第三方小部件。遗憾的是,前端 SPOF 可能是由任何第三方小部件引起的,甚至是主网站自己的脚本、样式表或字体文件引起的。

One possible takeaway from these examples might be that frontend SPOF is specific to Twitter and eyewonder and a few other third-party widgets. Sadly, frontend SPOF can be caused by any third-party widget, and even from the main website’s own scripts, stylesheets, or font files.

这些示例的另一个可能的收获可能是避免被防火墙阻止的第三方小部件。但防火长城并不是导致前端 SPOF 的唯一原因,它只是使其更容易重现。任何需要很长时间才能返回的脚本、样式表或字体文件都有可能导致前端 SPOF。这种情况通常发生在出现中断或其他类型的故障时,例如服务器过载,HTTP 请求在服务器队列中滞留太久,导致浏览器超时。

Another possible takeaway from these examples might be to avoid third-party widgets that are blocked by the Great Firewall. But the Great Firewall isn’t the only cause of frontend SPOF—it just makes it easier to reproduce. Any script, stylesheet, or font file that takes a long time to return has the potential to cause frontend SPOF. This typically happens when there’s an outage or some other type of failure, such as an overloaded server where the HTTP request languishes in the server’s queue for so long the browser times out.

前端 SPOF 的真正原因是以阻塞方式加载脚本、样式表或字体文件。我的前端 SPOF ( http://www.stevesouders.com/blog/2010/06/01/frontend-spof/ ) 博客文章中的表格显示了这种情况发生的时间。实际上,网站所有者可以控制其网站是否容易受到前端 SPOF 的影响。那么网站所有者该怎么办呢?

The true cause of frontend SPOF is loading a script, stylesheet, or font file in a blocking manner. The table in my frontend SPOF (http://www.stevesouders.com/blog/2010/06/01/frontend-spof/) blog post shows when this happens. It’s really the website owner who controls whether or not their site is vulnerable to frontend SPOF. So what’s a website owner to do?

避免前端单点故障

Avoiding Frontend SPOF

避免前端 SPOF 的最佳方法是异步加载脚本。许多流行的第三方小部件默认都会执行此操作,例如 Google Analytics、Facebook 和 Meebo。Twitter 还为 Tweet 按钮提供了一个异步片段 (https://dev.twitter.com/docs/tweet-button),O'Reilly Radar 应该使用它。如果您使用的小部件不提供异步版本,您可以尝试 Stoyan 的社交按钮 BFF (http://www.phpied.com/social-button-bffs/) 异步模式。

The best way to avoid frontend SPOF is to load scripts asynchronously. Many popular third-party widgets do this by default, such as Google Analytics, Facebook, and Meebo. Twitter also has an async snippet (https://dev.twitter.com/docs/tweet-button) for the Tweet button that O’Reilly Radar should use. If the widgets you use don’t offer an async version you can try Stoyan’s Social button BFFs (http://www.phpied.com/social-button-bffs/) async pattern.
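As a minimal sketch of what asynchronous loading can look like, the function below creates the script element from JavaScript instead of writing a blocking SCRIPT tag into the markup. It is not Twitter's official snippet or any vendor's code; the name `loadScriptAsync` and the optional `doc` parameter are assumptions made for illustration, and the sketch assumes the page already contains at least one script element.

```javascript
// Insert a script element dynamically so that fetching it does not block
// parsing or rendering of the rest of the page.
function loadScriptAsync(src, doc = globalThis.document) {
  const script = doc.createElement('script');
  script.src = src;
  script.async = true;
  // Insert before the first existing script element. One is assumed to exist,
  // for example the inline script that calls this function.
  const firstScript = doc.getElementsByTagName('script')[0];
  firstScript.parentNode.insertBefore(script, firstScript);
  return script;
}
```

In a page, a call such as `loadScriptAsync('http://platform.twitter.com/widgets.js')` from an inline script would request the widget without holding up the elements that follow. If the request hangs, the button is missing but the rest of the page still renders.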

另一个解决方案是将您的小部件包装在 iframe 中。这并不总是可行,但在上面的两个示例中,小部件最终在 iframe 中提供服务。从一开始就将它们放入 iframe 中可以避免前端 SPOF 问题。

Another solution is to wrap your widgets in an iframe. This isn’t always possible, but in two of the examples above the widget is eventually served in an iframe. Putting them in an iframe from the start would have avoided the frontend SPOF problems.

为了简洁起见,我重点关注脚本的解决方案。字体文件的解决方案可以在我的@font-face和性能( http://www.stevesouders.com/blog/2009/10/13/font-face-and-performance/)博客文章中找到。我不知道有多少关于异步加载样式表的研究。导致过多的回流和 FOUC ( http://bluerobot.com/web/css/fouc.asp/ ) 是需要解决的问题。

For the sake of brevity I’ve focused on solutions for scripts. Solutions for font files can be found in my @font-face and performance (http://www.stevesouders.com/blog/2009/10/13/font-face-and-performance/) blog post. I’m not aware of much research on loading stylesheets asynchronously. Causing too many reflows and FOUC (http://bluerobot.com/web/css/fouc.asp/) are concerns that need to be addressed.

呼吁采取行动

Call to Action

Business Insider、CNET 和 O'Reilly Radar 都有来自中国的访问者,但其页面的构建方式却带来了糟糕的用户体验,大部分(如果不是全部)页面被屏蔽超过一分钟。这不是 P2 前端 JavaScript 问题。这是一次停电。如果这些网站的后端服务器花了 1 分钟才发回响应,那么您可以打赌 Business Insider、CNET 和 O'Reilly 的 DevOps 团队不会睡觉,直到问题得到解决。那么为什么人们对前端 SPOF 的关注如此之少呢?

Business Insider, CNET, and O’Reilly Radar all have visitors from China, and yet the way their pages are constructed delivers a bad user experience where most if not all of the page is blocked for more than a minute. This isn’t a P2 frontend JavaScript issue. This is an outage. If the backend servers for these websites took 1 minute to send back a response, you can bet the DevOps teams at Business Insider, CNET, and O’Reilly wouldn’t sleep until the problem was fixed. So why is there so little concern about frontend SPOF?

前端单点故障并没有引起太多关注——考虑到它很容易导致网站瘫痪,它绝对没有得到应有的关注。原因之一是很难诊断。如果服务器响应时间超过 60 秒,许多监视器就会开始关闭。由于所有这些活动都在后端,因此更容易隔离原因。当客户端页面加载时间超过 60 秒时,寻呼机是否不会关闭?这很难相信,但也许情况确实如此。

Frontend SPOF doesn’t get much attention, and it definitely doesn’t get the attention it deserves given how easily it can bring down a website. One reason is that it’s hard to diagnose. There are a lot of monitors that will start going off if a server response time exceeds 60 seconds. And since all that activity is on the backend, it’s easier to isolate the cause. Is it that pagers don’t go off when client-side page load times exceed 60 seconds? That’s hard to believe, but perhaps that’s the case.

也许这是跟踪页面加载时间的方式。如果您查看的是全球中位数甚至平均值,并且中国不是主要受众,那么当前端发生 SPOF 时,您的页面加载时间统计数据可能不会超过警报级别。或者,页面加载时间可能主要使用综合测试来跟踪,并且这些用户代理不会受到诸如防火墙之类的现实世界问题的影响。

Perhaps it’s the way page load times are tracked. If you’re looking at worldwide medians, or even averages, and China isn’t a major audience, your page load time stats might not exceed alert levels when frontend SPOF happens. Or maybe page load times are mostly tracked using synthetic testing, and those user agents aren’t subjected to real world issues like the Great Firewall.

网站所有者可以做的一件事是忽略前端 SPOF,直到它由未来的某些中断触发。快速计算表明这是一个可怕的选择。如果第三方小部件的正常运行时间为 99.99%,并且网站有五个非异步小部件,则前端 SPOF 的概率为 0.05%。如果我们将正常运行时间降低到 99.9%,则前端 SPOF 的概率就会上升到 0.5%。五个小部件可能很高,但请记住“第三方小部件”包括广告和指标。另外,网站自身的资源也会导致前端SPOF,从而导致这个数字更高。目前,网站平均包含 14 个脚本 ( http://httparchive.org/trends.php#bytesJS&reqJS ),如果未异步加载,其中任何一个脚本都可能导致前端 SPOF。

One thing website owners can do is ignore frontend SPOF until it’s triggered by some future outage. A quick calculation shows this is a scary choice. If a third-party widget has a 99.99% uptime and a website has five widgets that aren’t async, the probability of frontend SPOF is 0.05%. If we drop uptime to 99.9%, the probability of frontend SPOF climbs to 0.5%. Five widgets might be high, but remember that “third-party widget” includes ads and metrics. Also, the website’s own resources can cause frontend SPOF which brings the number even higher. The average website today contains 14 scripts (http://httparchive.org/trends.php#bytesJS&reqJS) any of which could cause frontend SPOF if they’re not loaded async.
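The quick calculation above can be reproduced with a few lines. This sketch assumes that every blocking resource fails independently and has the same uptime, which is a simplification; the function name is illustrative.

```javascript
// Probability that at least one of `count` blocking resources is unavailable,
// assuming independent failures and the same uptime for each resource.
function spofProbability(uptime, count) {
  return 1 - Math.pow(uptime, count);
}

// Format a probability as a percentage with two decimal places.
function asPercent(probability) {
  return (probability * 100).toFixed(2) + '%';
}

console.log('5 widgets at 99.99% uptime:', asPercent(spofProbability(0.9999, 5)));
console.log('5 widgets at 99.9% uptime: ', asPercent(spofProbability(0.999, 5)));
console.log('14 scripts at 99.9% uptime:', asPercent(spofProbability(0.999, 14)));
```

The first two lines correspond to the roughly 0.05% and 0.5% figures in the text. The third applies the same formula to the 14 scripts of the average website, under the hypothetical assumption that all of them are loaded in blocking mode.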

前端单点故障是一个需要更多关注的现实问题。网站所有者应该使用异步代码片段和模式,监控真实的用户页面加载时间,并关注平均值以外的 95% 和标准偏差。执行这些操作将减轻用户遭受可怕的空白页的风险。链条的强度取决于其最薄弱的一环。您网站最薄弱的链接是什么?人们非常关注后端弹性。我敢打赌你最薄弱的环节是在前端。

Frontend SPOF is a real problem that needs more attention. Website owners should use async snippets and patterns, monitor real user page load times, and look beyond averages to 95th percentiles and standard deviations. Doing these things will mitigate the risk of subjecting users to the dreaded blank white page. A chain is only as strong as its weakest link. What’s your website’s weakest link? There’s a lot of focus on backend resiliency. I’ll wager your weakest link is on the frontend.
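To see why looking beyond averages matters, here is a small sketch with made-up numbers: 94 page views that load in 2 seconds and 6 that hang for 62 seconds behind a blocked script. The sample data and the summarize helper are hypothetical, and the percentile uses the simple nearest-rank method.

```python
import statistics


def summarize(load_times_s):
    """Return median, mean, nearest-rank 95th percentile, and standard deviation."""
    ordered = sorted(load_times_s)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integers.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "p95": ordered[rank - 1],
        "stdev": statistics.pstdev(ordered),
    }


if __name__ == "__main__":
    # Hypothetical real-user samples, in seconds.
    samples = [2.0] * 94 + [62.0] * 6
    for name, value in summarize(samples).items():
        print(f"{name}: {value:.1f}s")
```

In this made-up sample the median does not move at all, while the 95th percentile and the standard deviation both expose the users who were stuck.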

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/frontend-spof-in-beijing/。最初发布于 2011 年 12 月 8 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/frontend-spof-in-beijing/. Originally published on Dec 08, 2011.

第 9 章关于 YSlow 的一切

Chapter 9. All about YSlow

贝蒂

Betty Tso

自 2007 年以来,数以百万计的开发人员一直在使用 YSlow 来帮助他们找到使网页加载速度更快的方法。YSlow 分数一直是开发、QA 和生产阶段性能衡量的标准。

Since 2007, millions of developers have used YSlow to find ways to make their web pages load faster. The YSlow score has been the standard for performance measurement in the dev, QA, and production stages.

YSlow 最初是 Steve Souders 在 Yahoo! 时作为一个书签开始的,很快就成为流行的 Firefox 扩展。在过去的一年里,Marcel Duran 构建了 YSlow Chrome 扩展、Opera 扩展和 Safari 扩展。为了同时支持移动设备和其他浏览器,YSlow 还于 2011 年 6 月作为书签提供,具有全新的闪亮代码和新架构。

YSlow started as a bookmarklet written by Steve Souders while he was at Yahoo!, and soon became a popular Firefox extension. Over the past year, Marcel Duran built YSlow extensions for Chrome, Opera, and Safari. To support mobile devices and other browsers as well, YSlow was also released as a bookmarklet in June 2011, with fresh shiny code and a new architecture.

在 2011 年 12 月 7 日在Velocity China上发表讲话时,我们的团队宣布发布YSlow for Command Line beta 版,感谢我们的 FE 技术负责人 Marcel。此版本利用 Node.js 并采用.har文件作为输入来生成 URL 的 YSlow 分数。有多种输出选项可用 - JSON、XML 和纯文本。用户还可以将结果通过管道传输到信标服务器,并 http://www.showslow.com/beacon/yslow/在图形 UI 中查看结果。有关完整的 YSlow 信标规范,请参阅用户指南

While speaking at Velocity China on December 7, 2011, our team announced the beta release of YSlow for Command Line, courtesy of our FE tech lead, Marcel. This version runs on Node.js and takes .har files as input to generate the YSlow score for a URL. Several output options are available: JSON, XML, and plain text. Users can also pipe the result to a beacon server, such as http://www.showslow.com/beacon/yslow/, and view the result in a graphical UI. For the complete YSlow beacon spec, refer to the users’ guide.
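For readers who have not looked inside a .har file, the sketch below shows the kind of input such a tool works from. It is not YSlow's scoring logic and does not use any YSlow API; it only walks the standard HAR structure (log.entries, each with a request and a response) and totals the response body sizes. The sample data and the summarize_har name are made up for illustration.

```python
import json


def summarize_har(har):
    """Count requests and sum response body bytes in a parsed HAR document."""
    entries = har["log"]["entries"]
    body_bytes = 0
    for entry in entries:
        size = entry["response"].get("bodySize", -1)
        if size > 0:  # HAR uses -1 when the size is unknown
            body_bytes += size
    return {"requests": len(entries), "body_bytes": body_bytes}


# Tiny hand-written HAR fragment; a real capture has many more fields.
sample_har = json.loads("""
{"log": {"entries": [
  {"request": {"url": "http://example.com/"},
   "response": {"status": 200, "bodySize": 5120}},
  {"request": {"url": "http://example.com/app.js"},
   "response": {"status": 200, "bodySize": 20480}},
  {"request": {"url": "http://example.com/logo.png"},
   "response": {"status": 304, "bodySize": -1}}
]}}
""")

if __name__ == "__main__":
    print(summarize_har(sample_har))
```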

2012 年 2 月,YSlow 在 Github 上开源,并有了新家:yslow.org。从那时起,YSlow 就成为了一个社区驱动的工具——在开源公告的前 24 小时内,就有 437 个观察者和 37 个分叉。

In February 2012, YSlow was open sourced on GitHub and given a new home: yslow.org. Since then, YSlow has become a community-driven tool. Within the first 24 hours of the open source announcement, there were 437 watchers and 37 forks.

在 2012 年 4 月的亚马逊年度前端会议 ( http://wh.yslow.org/amazon-wdc ) 上发表讲话时,Marcel Duran 宣布推出适用于 PhantomJS 的 YSlow ( https://github.com/marcelduran/yslow/wiki/PhantomJS ),允许从实时 URL 进行页面性能分析的命令行脚本。

While speaking at Amazon’s annual frontend conference in April 2012 (http://wh.yslow.org/amazon-wdc), Marcel Duran announced YSlow for PhantomJS (https://github.com/marcelduran/yslow/wiki/PhantomJS), a command-line script that allows page performance analysis from live URLs.

图9-1中的图表记录了截至2011年12月9日过去几年YSlow的发展时间线。

The diagram in Figure 9-1 captures the timeline of YSlow development over the past few years as of December 9, 2011.

Y缓慢的时间线

图 9-1。Y缓慢的时间线

Figure 9-1. YSlow timeline

你可知道…?

Did you know…?

  • YSlow 还可以用作构建与浏览器交互的扩展的框架。有关代码示例,请参阅 Stoyan Stefanov 的文章: Web 测试框架

  • YSlow can also be used as a framework to build extensions that talk to browsers. Refer to Stoyan Stefanov’s article for code samples: Web Testing Framework.

  • 从v3.0.5开始,YSlow有了一个新功能:一键添加cdn到CDN自定义列表,允许用户在适用时将CDN添加到自定义列表。

  • Starting with v3.0.5, YSlow has a new one-click-add-cdn feature, which lets users add CDNs to a custom CDN list when applicable.

  • YSlow的社交功能可以让用户与Facebook和Twitter好友分享他们的YSlow分数;共享的链接指向getyslow.com上的 YSlow Scoremeter 。通过 Scoremeter,用户能够估计修复对 YSlow 得分结果的影响。以下是我在 Facebook 上分享的示例链接: example Scoremeter

  • YSlow’s social feature lets users share their YSlow score with Facebook and Twitter friends; the shared link points to the YSlow Scoremeter on getyslow.com. With the Scoremeter, the user can estimate the impact of a fix on the resulting YSlow score. Here is a sample link shared on my Facebook: example Scoremeter.

  • 这是YSlow backlog 功能的完整列表。

  • Here is the full list of YSlow backlog features.

一如既往,我们很乐意听到您的反馈。您可以通过官方网站FacebookTwitter或通过电子邮件联系我们:ask@yslow.org。

As always, we would love to hear your feedback. You can reach us on the official site, Facebook, Twitter, or via email at ask@yslow.org.

特别感谢 Lauren Tsung,他在这篇文章中创建了信息图。Lauren 目前在 Yahoo! 担任交互设计师。系统工具团队。

Special thanks to Lauren Tsung, who created the infographic in this post. Lauren currently works as an interactive designer on the Yahoo! System Tools team.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/all-aboout-yslow/。最初发布于 2011 年 12 月 9 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/all-aboout-yslow/. Originally published on Dec 09, 2011.

第 10 章高性能本机移动应用程序的秘密

Chapter 10. Secrets of High Performance Native Mobile Applications

以色列 尼尔

Israel Nir

自从 Steve Souders 出版了他的开创性著作《高性能网站》以来四年前,世界发生了巨大变化。网站变得更快,浏览器显着改进,用户开始期望顶级性能。在这四年中,一种新的面向客户端的应用程序诞生了,目前它很少受到性能社区的关注——原生移动应用程序。这些应用程序有其自身的挑战和机遇。幸运的是,它们与优秀的旧 Web 应用程序也有很多共同点。有一件事是肯定的,用户希望本机应用程序的执行速度与网站一样快,甚至更快。随着圣诞节热潮如火如荼地进行,用户势必对性能不佳的应用程序更加不能容忍,因此我认为现在是时候看看最畅销的移动应用程序的表现如何,同时也可以削弱应用程序的性能。我的节日礼物清单。

Since Steve Souders published his seminal book High Performance Web Sites four years ago, the world has changed considerably. Web sites became faster, browsers improved significantly, and users started to expect top performance. During these four years, a new category of client-facing applications was born, one that currently receives little attention from the performance community: native mobile applications. These applications have their own set of challenges and opportunities. Luckily, they also have a lot in common with good old web applications. One thing’s for certain: users expect native apps to perform as fast as web sites, if not faster. With the Christmas rush in full swing, users are bound to be even less tolerant of poorly performing apps, so I figured it’s a good time to see how the top sellers’ mobile apps perform and, at the same time, make a dent in my holiday gift list.

最影响应用程序性能的两个因素是什么?我不会讨论本机代码调整,因为这主要依赖于平台,并且可能会让你们大多数人昏昏欲睡。因此,让我们关注移动性能调整——改进应用程序在网络上的行为。考虑到这些应用程序最有可能遇到的网络条件(例如高延迟和低带宽),网络利用率的重要性就显得更加重要。

What are the two factors that most affect app performance? I’m not going to discuss native code tweaks, since this is predominantly platform-dependent and will probably put most of you to sleep. So let’s focus on mobile performance tuning—improving the application’s behavior over the network. The importance of network utilization is even greater considering the kind of network conditions these apps are most likely to encounter, such as high latency and low bandwidth.

为了分析移动应用程序的网络流量,您可以首先在计算机上设置临时 WiFi 网络,将移动设备连接到该网络并在计算机上运行数据包捕获。然后使用 Wireshark 等应用程序检查应用程序生成的流量,或将数据包捕获加载到 PcapPerf 等工具中。另一种选择是使用代理,例如 Fiddler 的 Charles Proxy,但请注意,它可能会影响您的应用程序的网络行为,例如限制并发连接数。我个人使用我公司的工具(Shunra vCat with Analytics,http://www.shunra.com/products/shunra-vcat)来捕获和分析应用程序的流量。这些工具还使我能够模拟移动网络,因此我可以更轻松地检测可能仅在各种移动网络(例如 3G)上出现的问题。

To analyze a mobile app’s network traffic, you can start by setting up an ad-hoc WiFi network on a computer, connecting your mobile device to that network, and running a packet capture on the computer. Then use an application such as Wireshark to examine the traffic generated by your application, or load the packet capture into a tool like PcapPerf. Another option is to use a proxy, such as Charles Proxy or Fiddler, but be aware that a proxy may change your app’s network behavior, for example by limiting the number of concurrent connections. Personally, I use my company’s tools (Shunra vCat with Analytics, http://www.shunra.com/products/shunra-vcat) to capture and analyze the app’s traffic. These tools also let me emulate mobile networks, so it’s easier to detect problems that may only manifest on certain mobile networks, such as 3G.

留意你的瀑布

Keep an Eye on Your Waterfalls

是时候开始认真购物了,让我们看看主要的移动零售玩家之一。从妈妈这个环游世界的旅行者开始,我想一套新的行李箱会很受欢迎。这里有很多选择——现在她最喜欢什么颜色?我有很多时间思考这个问题,因为这家零售商的 iPhone 应用程序需要相当长的时间才能加载。对 HTTP 瀑布的检查揭示了一条长长的资源菊花链,相互阻塞,持续 7.5 秒。请注意,在这种情况下,图像会阻止并行下载,这通常不会在 Web 应用程序中看到(图 10-1)。

Time to start some serious shopping, so let’s look at one of the major mobile retail players. Starting with Mom, the world traveller, I thought a new luggage set would be appreciated. Lots of choices here—now what’s her favorite color? I had lots of time to ponder this question, because this retailer’s iPhone app takes quite a while to load. Examination of the HTTP waterfall reveals a long daisy chain of resources blocking each other, lasting for 7.5 seconds. Notice that in this case, images are blocking parallel downloads, which is something you won’t typically see in a web app (Figure 10-1).

阻止下载

图 10-1。阻止下载

Figure 10-1. Blocking downloads

虽然 Web 开发人员可以通过一些简单的调整来启用并行下载,并信任浏览器制造商,但应由本机应用程序开发人员提出最佳的并发下载方案。我们的研究表明,即使在移动网络上,您也可以通过使用最多四个并行下载来获得性能提升,并且高级用户可以切换到 HTTP 管道来获得另一次速度提升。

While web developers can enable parallel downloads with a few simple tweaks and put their trust in browser makers, it’s up to the native app developer to come up with the optimal concurrent download scheme. Our research shows that even on mobile networks you can obtain a performance gain by using up to four parallel downloads, and advanced users can switch to HTTP pipelining to acquire another speed boost.
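A bounded download scheme like the one described can be sketched with a small worker pool. This is an illustration in Python rather than native iOS or Android code, and the names download_all, fetch, and MAX_PARALLEL are made up. The limit of four comes from the paragraph above; the right value for a given app and network is something to measure.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

MAX_PARALLEL = 4  # upper bound suggested in the text; tune by measurement


def fetch(url, timeout=10.0):
    """Download one resource and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch_one=fetch, max_parallel=MAX_PARALLEL):
    """Return {url: body}, with at most max_parallel requests in flight."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so bodies line up with urls.
        bodies = list(pool.map(fetch_one, urls))
    return dict(zip(urls, bodies))
```

Passing the fetch function as a parameter keeps the concurrency policy separate from the transport, so the same limit can be exercised in tests without touching the network.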

压缩这些资源

Compress Those Resources

在图 10-1的瀑布中,您可能会注意到第一个资源services.xml长 81KB,并且需要超过一秒的时间才能通过网络获取(阻止其后面的任何其他资源)。在那一秒中,仅下载文件就花费了 812 毫秒。查看响应标头可以发现它是未压缩发送的。如果经过压缩,其重量仅为 6KB,响应时间至少可节省半秒。显然,它不是使用此应用程序发送未压缩的唯一资源(图 10-2)。

In the waterfall in Figure 10-1, you may notice that the first resource, services.xml, is 81KB and takes more than a second to fetch over the network (blocking every resource that follows it). Of that second, 812ms are spent just downloading the file. The response headers show that it was sent uncompressed. Had it been compressed, it would have weighed only 6KB, saving at least half a second in response time. It’s not the only resource this app sends uncompressed (Figure 10-2).
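It is easy to check how much a text resource would shrink before deciding whether compression is worth enabling. The sketch below gzips a made-up, repetitive XML document standing in for a file like services.xml; the sample content and the compression_savings name are assumptions, and real savings depend on the actual file.

```python
import gzip


def compression_savings(body):
    """Return (original_size, compressed_size, fraction_saved) for a bytes body."""
    compressed = gzip.compress(body)
    saved = 1 - len(compressed) / len(body)
    return len(body), len(compressed), saved


# Made-up XML payload; repetitive markup like this compresses very well.
sample = b"<services>" + b"".join(
    b'<service id="%d"><name>service-%d</name><enabled>true</enabled></service>'
    % (i, i)
    for i in range(1000)
) + b"</services>"

if __name__ == "__main__":
    original, compressed, saved = compression_savings(sample)
    print(f"{original} bytes -> {compressed} bytes ({saved:.0%} saved)")
```

On the wire, the client has to advertise support with an Accept-Encoding: gzip request header and the server has to honor it; the sketch only measures the potential saving.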

未压缩的资源

图 10-2。未压缩的资源

Figure 10-2. Uncompressed resources

不要两次下载相同的内容

Don’t Download the Same Content Twice

这应该是理所当然的事情,但我们在许多 Android 和 iPhone 应用程序中观察到了这种性能反模式,值得指出。在实现本机应用程序时,开发人员有责任实现基本的缓存机制。仅设置 http 响应的缓存标头通常是不够的。以下是我在一家以手工制品闻名的电子商务网站的 iPhone 应用程序中寻找婴儿礼物时发生的情况(图 10-3)。

This should be a no-brainer, but we have observed this performance anti-pattern in so many Android and iPhone apps that it’s worth pointing out. When implementing a native app, it’s the developer’s responsibility to implement a basic caching mechanism. Just setting the caching headers of HTTP responses is usually not enough. Here’s what happened when I was looking for a baby gift using the iPhone app of an e-commerce site known for its handmade items (Figure 10-3).

重复图像

图 10-3。重复图像

Figure 10-3. Duplicate images

可爱的宝贝,但同一张图像被下载了三次,这对于许多其他也被多次下载的图像来说是典型的。此外,某些图像在同一 TCP 会话中下载了多个实例。创建一个基本的缓存层(只要应用程序运行就将元素缓存在内存中)并不复杂。它极大地提高了绩效并凸显了您的专业水平。

Cute baby, but the same image was downloaded three times, and this was typical of many other images that were also downloaded multiple times. Moreover, some images were downloaded more than once within the same TCP session. Creating a basic caching layer, one that caches elements in memory for as long as the application is running, is not that complicated. It greatly improves performance and highlights your professionalism.
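A basic caching layer of this kind really is small. The following is a minimal sketch, again in Python rather than platform code: MemoryCache is a made-up name, it is not thread-safe, and it never evicts anything, so a real app would add a size limit and locking.

```python
class MemoryCache:
    """Keeps downloaded bodies in memory for the lifetime of the process."""

    def __init__(self, fetch_one):
        self._fetch_one = fetch_one  # function that downloads one URL
        self._bodies = {}

    def get(self, url):
        """Return the body for url, downloading it only on the first request."""
        if url not in self._bodies:
            self._bodies[url] = self._fetch_one(url)
        return self._bodies[url]


if __name__ == "__main__":
    downloads = []

    def fake_fetch(url):
        downloads.append(url)  # record each real download
        return b"image bytes for " + url.encode()

    cache = MemoryCache(fake_fetch)
    for _ in range(3):
        cache.get("http://example.com/baby.jpg")
    print(len(downloads), "download(s) for 3 requests")
```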

太多的阿德瑞娜·利玛会让你放慢脚步吗?

Can Too Much Adriana Lima Slow You Down?

厌倦了寻找通常的圣诞礼物,我启动了一家著名内衣零售商的应用程序,寻找,嗯,袜子来放入我女朋友的圣诞袜中。虽然我和其他人一样喜欢看 Adriana Lima,但下载她和其他 VS 模特的大图片实际上是相当痛苦的。令人惊讶的是,虽然我使用的是 iPhone,但我同时获得了 iPhone 和 iPad 版本的图像。iPad 图像显然没有针对我的小屏幕进行优化,浪费了半兆字节的流量。虽然这在有线网络上可能没问题,但在移动设备上却令人恼火(图 10-4)。

Tired of looking for the usual Christmas presents, I launched a famous lingerie retailer’s app, looking for, hmmm, stockings to put in my girlfriend’s Christmas stocking. Though I enjoy looking at Adriana Lima as much as the next guy, downloading huge images of her and the other VS models was actually quite painful. Surprisingly, although I was using an iPhone, I was getting both iPhone and iPad versions of the images. The iPad images were obviously not optimized for my small screen, and amounted to half a megabyte of wasted traffic. Although this might be OK over a wired network, it’s exasperating on a mobile (Figure 10-4).

iPad 版本的重复图像传送到 iPhone

图 10-4。iPad 版本的重复图像传送到 iPhone

Figure 10-4. Duplicate images with iPad versions served to iPhone

在过去的一年里,我们遇到了许多表现出类似性能失礼的应用程序。嘻哈飞行搜索应用 Hipmunk 下载了一个大数据文件(http://www.shunra.com/shunrablog/index.php/2011/03/21/being-slow-is-not-hip/)(后 650KB)压缩),将整个搜索结果包含在一个块中。最好将该文件拆分为几个较小的文件,其中一些文件可以异步下载。其他应用程序下载许多非常小的文件,可以轻松地将这些文件组合成更少的大文件,以避免由于移动网络的高延迟而造成的性能损失。

During the past year we have encountered many applications that exhibit similar performance faux-pas. Hipmunk, the hip flight search application, downloaded a big data file (http://www.shunra.com/shunrablog/index.php/2011/03/21/being-slow-is-not-hip/) (650KB after compression), containing the entire search results in one chunk. It would have been better to split that file into several smaller files, some of which could be downloaded asynchronously. Other applications download many very small files that could be easily combined into fewer larger files to circumvent a performance hit due to the high latency in mobile networks.

结语

Epilogue

这只是本机移动应用程序性能最佳实践的一个简短示例,表明性能良好的本机应用程序和网站的一些原则并没有那么不同。消除不必要的下载(就字节数和请求数而言),并通过利用并行化和异步下载来管理其余部分,以充分利用网络。在网站上,您可以将许多任务交给浏览器,而在本机应用程序中,这主要取决于您。性能调整的空间要大得多,但出错的空间也更大。因此,如果有一个重要的要点,那就是始终尽早测试您的应用程序,永远不要让性能碰运气。

This is just a short sample of performance best practices for native mobile apps, indicating that some of the principles behind well-performing native apps and websites are not that different. Eliminate unnecessary downloads (with respect to both the number of bytes and the number of requests), and manage the rest to make good use of the network by leveraging parallelization and asynchronous downloads. While with web sites you delegate many of those tasks to the browser, with native apps it’s mostly up to you. The room for performance tweaks is much larger, but so is the room for mistakes. Thus, if there’s one important takeaway, it’s to always test your apps early and never leave performance to chance.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/secrets-of-high-performance-native-mobile-applications/。最初发布于 2011 年 12 月 10 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/secrets-of-high-performance-native-mobile-applications/. Originally published on Dec 10, 2011.

第 11 章 纯 CSS3 图像?嗯,也许稍后

Chapter 11. Pure CSS3 Images? Hmm, Maybe Later

马塞尔 ·杜兰

Marcel Duran

几位设计师在雅虎!要求将原始的 YSlow 徽标 PSD 用于 T 恤、海报、传单等宣传材料中。在今年发生的一些活动中,自从我加入卓越绩效团队后,我不知道它在哪里(http ://developer.yahoo.com/performance/)来处理 YSlow ( http://yslow.org/ ) 以及其他性能工具。为了解决这个问题,我决定从头开始重建它,因为它看起来并不那么复杂,问题是我是一个速度狂,而不是一个受到著名的纯CSS Twitter失败鲸启发的设计师(http:// www .subcide.com/articles/pure-css-twitter-fail-whale/)我用我的 CSS 肌肉来锻炼,显然专注于性能,为这些设计师提供一个可扩展的 YSlow 徽标(http://wh.yslow.org/css3-logo),让他们高兴,并可能拥有更小的图像负载在网络上使用。

Several designers at Yahoo! asked for the original YSlow logo PSD to use in promotional materials such as t-shirts, posters, and flyers for events held this year. I had not known where it was ever since I joined the Exceptional Performance Team (http://developer.yahoo.com/performance/) to take care of YSlow (http://yslow.org/) among other performance tools. To solve this problem I decided to rebuild the logo from scratch, because it didn't seem so complicated. The catch was that I am a speed freak, not a designer. So, inspired by the famous pure CSS Twitter fail whale (http://www.subcide.com/articles/pure-css-twitter-fail-whale/), I put my CSS muscles to work, focusing on performance, to give those designers a scalable YSlow logo (http://wh.yslow.org/css3-logo) and, potentially, a smaller image payload for use on the Web.

挑战

The Challenge

从性能角度来看,这是一个有趣的挑战,因为我使用的代码越少,最终图像就越小,并且执行速度越快(渲染时间)。我的目标是实现一种通用的解决方案,以便在网络上广泛使用。除了性能之外,作为一名前端工程师,我还对 CSS3 如何帮助解决这个问题(可能是跨浏览器)以及所施加的限制感兴趣。我使用 Chrome 进行开发,所以我的第一个目标是首先在该浏览器上实现这一点,然后再使其跨浏览器兼容。渲染时间的基准测试也很容易,这是我在谈论 CSS3 背景渐变、边框半径、变换等时最关心的问题。

It was an interesting challenge from a performance perspective, since the less code I used, the smaller the final image would be and the faster it would perform (rendering time). My goal was a one-size-fits-all solution to be used in the wild on the Web. Besides performance, as a front end engineer, I was also interested in how CSS3 could help solve this issue (cross-browser, if possible) and in the limitations it imposed. I use Chrome for development, so my first goal was to make it work in that browser before making it cross-browser compatible. It was also easy to benchmark the rendering time there, which was my main concern when it comes to CSS3 background gradients, border radius, transformations, etc.

亲自动手进行 CSS3 烹饪

Getting My Hands Dirty with CSS3 Cooking

将 JSFiddle ( http://jsfiddle.net/ ) 作为我的游乐场确实很有帮助,因为这是一项反复试验的任务,而且我可以轻松跟踪版本并共享。Chrome 开发者工具:元素样式 ( http://code.google.com/chrome/devtools/docs/elements-styles.html#styles_edit ) 也发挥了重要作用,让我可以即时测试我的更改。

Having JSFiddle (http://jsfiddle.net/) as my playground was really helpful because it was a trial-and-error task, plus I could keep track of versions and share so easily. Chrome Developer Tools: Element Styles (http://code.google.com/chrome/devtools/docs/elements-styles.html#styles_edit) also played an important role letting me test my changes on-the-fly.

我的 JSFiddle Playground 位于http://jsfiddle.net/marcelduran/g7KvW/6/,您可以在其中查看代码和最终图像结果。CSS 和 HTML 代码(这里没有 JavaScript)也列在本章末尾。

My JSFiddle playground is available at http://jsfiddle.net/marcelduran/g7KvW/6/, where you can see the code and final image result. The CSS and HTML code (no JavaScript here) is also listed at the end of the chapter.

fiddle 的“结果”选项卡上的三个图像从上到下依次为:原始(250 像素宽度)图像、250 像素宽度的纯 CSS3 图像和 50% 宽度的纯 CSS3 图像。如果您在 Chrome 中加载 fiddle,预计会获得更好的结果。JSFiddle 还允许您分叉代码并应用您自己的更改,所以请成为我的客人。

The three images on the Result tab of the fiddle are, from top to bottom: the original image (250px wide), pure CSS3 at 250px width, and pure CSS3 at 50% width. You should get better results if you load the fiddle in Chrome. JSFiddle also allows you to fork the code and apply your own changes, so be my guest.

使用 21 个 DOM 元素(按<style>块计算为 22 个),并通过使用不均匀 border-radius的几何图形、背景渐变使其闪亮、圆润且更真实,以及一些变换旋转足以最终获得没有红针的 YSlow 速度计徽标。我的第一次尝试是使用 DOM 元素边框来实现一个尖三角形(http://jonrohan.me/guide/css/creating-triangles-in-css/),它工作得很好,但不幸的是,由于百分比值,它没有缩放不允许(http://www.w3.org/TR/CSS2/box.html#value-def-border-widthborder-width。此外,背景渐变也不适用于边框,使其不像原始图像那样闪亮。当我碰壁时,我联系了我的前同事蒂埃里·科布伦茨 (Thierry Koblentz) ( http://twitter.com/thierrykoblentz ),他来救援了。他不仅把 CSS 当早餐吃,而且总是准备迎接 CSS 的挑战。令人印象深刻的是,他想出了一个非常好的解决方案,使用旋转移位的 DIV 隐藏不需要的部分 overflow:hidden,这使我能够通过背景渐变使其闪亮。作为一个优点,他还包括一个很好的过渡,可以在悬停时将针平滑地动画到最大值,这样的功能在常规 PNG/JPG 图像中不可用。

With 21 DOM elements (22 counting the <style> block), uneven border-radius values for the geometry, background gradients to make it shiny, rounded, and more realistic, and a few transform rotations, I finally had the YSlow speedometer logo, minus the red needle. My first attempt was to use DOM element borders to achieve a pointy triangle (http://jonrohan.me/guide/css/creating-triangles-in-css/), which works fine but unfortunately does not scale, because percentage values are not allowed (http://www.w3.org/TR/CSS2/box.html#value-def-border-width) on border-width. Background gradients do not apply to borders either, so the triangle was not shiny as in the original image. When I hit this wall, I pinged my former co-worker Thierry Koblentz (http://twitter.com/thierrykoblentz), and he came to the rescue. He eats CSS for more than just breakfast and is always up for a CSS challenge. Impressively, he came up with a very nice solution using rotated, displaced DIVs that hide the undesired parts with overflow:hidden, which allowed me to make it shiny with a background gradient. As a plus, he also included a nice transition that smoothly animates the needle to the max value on hover, a feature not available in regular PNG/JPG images.

在我实现了 Chrome 的目标之后,基本上使用 CSS3 的 W3C 规范和一些-webkit-前缀,是时候攻击其他浏览器了,所以我开始为 Internet Explorer添加其他供应商前缀,例如-moz--o--ms-和。filter

After I reached my goal for Chrome, using basically the W3C specification for CSS3 and a few -webkit- prefixes, it was time to attack the other browsers, so I started adding other vendor prefixes like -moz-, -o-, and -ms-, plus filter for Internet Explorer.

跨浏览器结果

Cross-Browser Results

我对跨浏览器的结果非常失望,在花了一些时间试图找到一种方法来解决所有浏览器的问题而不增加 CSS 代码或添加更多 HTML 元素之后,我放弃了并扮演约翰列侬:“想象一下没有跨浏览器问题……”我想知道我们尊敬的表演日历策展人 ( http://twitter.com/stoyanstefanov ) 以前怎么没有想到过这样一首歌曲 ( http://www.youtube.com/watch?v= bPdkWJe9XH0 )。

I was very disappointed with the cross-browser results. After spending some time trying to fix things for all browsers without increasing the CSS code or adding more HTML elements, I gave up and played John Lennon: “Imagine there's no cross-browser issue…” I wonder why our honorable Performance Calendar curator (http://twitter.com/stoyanstefanov) hasn't thought of such a song before (http://www.youtube.com/watch?v=bPdkWJe9XH0).

原始图像(PNG24)如图11-1所示。

The original image (PNG24) is shown in Figure 11-1.

PNG24 格式的原始 YSlow 徽标

图 11-1。PNG24 格式的原始 YSlow 徽标

Figure 11-1. Original YSlow logo in PNG24 format

测试浏览器带注释的截图如图 11-2(非IE浏览器)和图11-3(不同IE版本)。这些图中的左列图像显示了使用特定于供应商的 CSS 时的结果,右列仅显示 W3C 有效的 CSS3。

The screenshots for the tested browsers with comments are shown in Figure 11-2 (non-IE browsers) and Figure 11-3 (different IE versions). The left column of images in those figures shows the result when using vendor-specific CSS and the right column is W3C-valid CSS3 only.

非 IE 浏览器中的结果

图 11-2。非 IE 浏览器中的结果

Figure 11-2. Results in non-IE browsers

IE 中的结果

图 11-3。IE 中的结果

Figure 11-3. Results in IE

有趣的是,仅限 W3C 的版本如何优雅地回退,这表明没有浏览器严格遵循规范,或者在撰写本文时规范尚未完全定义。即使不完全像原来的,除了一些例外,它们在某种程度上看起来都像一个速度计,除了呃,猜猜是谁?

It is interesting how the W3C-only versions fall back gracefully, which shows either that no browser strictly follows the specs or that the specs are not yet fully defined at the time of this writing. With a few exceptions they all look something like a speedometer gauge, even if they do not fully resemble the original. The outlier is, er, guess who?

由于纯 CSS3 图像至少在 Chrome 上运行良好,我能够为设计师提供他们想要的东西,这足以让我开始我的性能基准测试。我知道有人可能会说,可以让它在具有更多 DOM 元素和/或更多 CSS 选择器/规则的其他浏览器上更好地工作,但这是一项耗时的任务,而且我在业余时间正在研究它,所以足够了CSS,让我们看看我们来这里的目的是什么。

With that pure CSS3 image working decently at least on Chrome, I was able to provide the designers what they were after and that was enough for me to start my performance benchmarking. I know one might argue it’s possible to make it work better on other browsers with more DOM elements and/or more CSS selectors/rules, but that was a time-consuming task and I was working on it during my spare time, so enough with CSS and let’s see what we are here for.

标杆管理

Benchmarking

为了比较真实的图像文件(http://wh.yslow.org/css3-logo-images)与CSS3生成的图像文件(http://wh.yslow.org/css3-logo-payload),我创建了一个少数页面每页仅包含一张图像,或者是真实文件 URL 和数据 URI ( http://en.wikipedia.org/wiki/Data_URI_scheme ) ( ) 或 CSS3(同一页面中的&ltimg src="...">HTML + CSS 块)。<style>

To compare real image files (http://wh.yslow.org/css3-logo-images) with CSS3-generated ones (http://wh.yslow.org/css3-logo-payload), I created a few pages containing only one image each: either a real file URL or a data URI (http://en.wikipedia.org/wiki/Data_URI_scheme) (<img src="...">), or CSS3 (HTML plus a CSS <style> block in the same page).

有效载荷

Payload

在本地 Apache 服务器中托管这些页面 ( http://wh.yslow.org/css3-logo-payload ),我能够Accept-Encoding: gzip,deflate通过curl( http://curl.haxx.se/ ) 获取压缩和不压缩 ( ) 的它们),显然无需压缩即可获取 CSS3 和数据 URI 的内容长度以及真实图像 URL。在此基准测试中,压缩后的长度被用作每页的有效负载(图 11-3)。

Hosting these pages (http://wh.yslow.org/css3-logo-payload) on a local Apache server, I fetched them via curl (http://curl.haxx.se/) with and without compression (Accept-Encoding: gzip,deflate), recording the content length of the CSS3 and data URI pages; the real image URLs were obviously fetched without compression. The minified, compressed lengths were used as the payload per page in this benchmark (Figure 11-3).

渲染

Rendering

在这些页面的底部添加一个小脚本 ( http://wh.yslow.org/css3-logo-renderingsessionStorage ),使用( https://developer.mozilla.org/)以 1 秒的间隔重新加载页面 100 次 en/DOM/Storage#sessionStorage)用于计数,并使用Chrome 开发者工具:记录页面活动的时间轴面板,我能够导出记录的数据(http://wh.yslow.org/css3-logo-logs)。然后使用NodeJS 脚本,我可以仅提取和过滤与渲染活动相关的时间,清理样本的顶部和底部 5% 以删除一些噪声数据,然后获取平均值(http://wh.yslow . org/css3-logo-结果)以毫秒为单位(图11-4)。

Adding a small script at the bottom of these pages (http://wh.yslow.org/css3-logo-rendering) that reloads the page 100 times at 1-second intervals, using sessionStorage (https://developer.mozilla.org/en/DOM/Storage#sessionStorage) for counting, and with the Chrome Developer Tools Timeline panel recording the page activity, I was able to export the logged data (http://wh.yslow.org/css3-logo-logs). Then, with a Node.js script, I extracted only the timings related to rendering activity, discarded the top and bottom 5% of the samples to remove noisy data, and took the average (http://wh.yslow.org/css3-logo-results) in milliseconds (Figure 11-4).
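The noise-removal step is a trimmed mean. The original script was written for Node.js and is not reproduced here; the sketch below shows the same idea in Python, with a hypothetical trimmed_mean helper and made-up timings.

```python
def trimmed_mean(samples, trim_fraction=0.05):
    """Average the samples after dropping the lowest and highest trim_fraction."""
    ordered = sorted(samples)
    cut = int(len(ordered) * trim_fraction)  # samples to drop at each end
    kept = ordered[cut:len(ordered) - cut]
    if not kept:
        raise ValueError("not enough samples to trim")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Made-up rendering times in milliseconds: 100 reloads with a few outliers.
    timings = [10.0] * 90 + [0.0] * 5 + [1000.0] * 5
    print(f"plain mean:   {sum(timings) / len(timings):.1f} ms")
    print(f"trimmed mean: {trimmed_mean(timings):.1f} ms")
```

With 100 reloads and a 5% trim, five samples are dropped from each end, so a handful of stray measurements cannot dominate the average.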

时间轴面板

图 11-4。时间轴面板

Figure 11-4. Timeline panel

对YSlow标志图像对比版本的分析如图11-5的表格所示,得出 图11-6的图表。该图表的数据可从 http://wh.yslow.org/css3-logo-data获取。

The analysis of the compared versions of the YSlow logo image is shown in the table in Figure 11-5, which leads to the chart in Figure 11-6. The data for the chart is available at http://wh.yslow.org/css3-logo-data.

YSlow标志图版本对比

图 11-5。YSlow标志图版本对比

Figure 11-5. The compared versions of YSlow logo image

有效负载与渲染

图 11-6。有效负载与渲染

Figure 11-6. Payload versus Rendering

与常规图像(无论是 URL 还是数据 URI)相比,CSS3 生成的图像可以获得更小的有效负载。在这个 YSlow 徽标示例中,W3C 标准 CSS3 大约比 PNG24 图像版本小 34 倍。相同图像类型的数据 URI 版本在压缩后具有大致相同的有效负载。它们仅增加了几个字节,有趣的是,本例中 JPG 的内联版本比常规 JPG 图像文件稍小。

CSS3-generated images can achieve smaller payloads than regular images, whether served by URL or as data URIs. In this YSlow logo example, the W3C-standard CSS3 payload is roughly 34 times smaller than the PNG24 version. Data URI versions of the same image type have about the same payload after compression, growing by only a few bytes. Interestingly, in this case the inline version of the JPG is slightly smaller than the regular JPG image file.

另一方面,CSS3 生成的图像渲染时间比常规图像更差,比 PNG24 版本慢约 6.5 倍。与常规图像文件版本相比,内联版本的渲染时间延长了一倍多。CSS3 W3C 标准版本的渲染速度比-webkit-具有所有浏览器供应商前缀的渲染速度快 2.5 倍。这并不一定意味着它真的更快,因为根据上面的屏幕截图结果,它们都没有触发所有 CSS 规则来根据原始版本正确渲染徽标。

On the other hand, the rendering time of CSS3-generated images is worse than that of regular images: around 6.5 times slower than the PNG24 version. The inline versions take more than twice as long to render as their regular image file counterparts. The W3C-standard CSS3 version rendered 2.5 times faster than the -webkit- version or the one with all browser vendor prefixes. This doesn’t necessarily mean it’s really faster: per the screenshots above, none of the W3C-only renderings applied all the CSS rules needed to render the logo as in the original.

这些渲染时间是通过在页面上显示静态图像来测量的,没有任何悬停用户交互来使 CSS3 版本上的仪表指针动画化。在允许用户在视口上隐藏和显示或拖放图像从而触发多次重绘、回流和重新样式的情况下,这些数字可能会增加 ( http://www.phpied.com /rendering-repaint-reflowrelayout-restyle/)在这些 DOM 元素上。

These rendering times were measured just by displaying the static images on the page, without any hover interaction to animate the gauge needle on the CSS3 versions. The numbers would likely increase in scenarios where users can hide and show or drag and drop images over the viewport, triggering several repaints, reflows, and restyles (http://www.phpied.com/rendering-repaint-reflowrelayout-restyle/) on these DOM elements.

从质量角度比较,带有所有前缀的 CSS3 或 -webkit-在 Chrome 上与 PNG24 版本相当,两者都有透明背景且没有像素化。CSS3 小了 34 倍,慢了 6.5 倍(以毫秒为单位),并且具有针对不同大小保持相同有效负载的优点,而 PNG 在从原始源(PSD 可用时)调整大小时会增加以避免质量损失,但用户无法在不截图的情况下将 CSS3 保存为图像。

Comparing apples to apples quality-wise, CSS3 with all prefixes or with -webkit- only is comparable on Chrome to the PNG24 version: both have a transparent background and no pixelation. CSS3 is 34 times smaller and 6.5 times slower (on the order of milliseconds), and it has the advantage of keeping the same payload at different sizes, whereas a PNG would grow when regenerated at a larger size from the original source (the PSD, when available) to avoid quality loss. However, users cannot save the CSS3 version as an image without taking a screenshot.

我们到了吗?

Are We There Yet?

并非如此,希望在不久的将来我们能够摆脱浏览器供应商的特定前缀,并拥有一个在所有浏览器中都同样有效的通用 CSS 解决方案。但即使我们做到了这一点,手动使用 DOM 元素和样式从头开始创建图像也是一项非常耗时的任务(SVG 就是为此设计的)。此类任务非常需要一种辅助绘图的插画工具,人们可以拖动贝塞尔曲线(http://en.wikipedia.org/wiki/B%C3%A9zier_curve),调整控制点以获得对应的点CSS3border-radius正确塑造几何线条的指令。

Not really. Hopefully, in the near future we'll get rid of vendor-specific prefixes and have a one-size-fits-all CSS solution that works equally well in all browsers. But even when we get there, creating images from scratch by hand with DOM elements and styles is a very time-consuming task (SVG is designed for this). Such a task badly needs an illustration tool in which one could drag Bézier curves (http://en.wikipedia.org/wiki/B%C3%A9zier_curve), adjusting the control points to produce the corresponding CSS3 border-radius declarations that shape the geometry properly.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/pure-css3-images-hmm-maybe-later/。最初发布于 2011 年 12 月 11 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/pure-css3-images-hmm-maybe-later/. Originally published on Dec 11, 2011.

附录:代码清单

Appendix: Code Listings

您还可以在http://jsfiddle.net/marcelduran/g7KvW/6/上实时使用代码。

You can also play with the code live at http://jsfiddle.net/marcelduran/g7KvW/6/.

超文本标记语言

HTML

<img src="http://d.yimg.com/jc/ydn/speedometer.png">
<div class="ys" style="width:250px">
    <div class="a">
        <div class="b">
            <div class="c">
                <div class="d">
                    <div class="e">
                        <div class="f"></div>
                        <div class="g"></div>
                        <div class="t t1"></div>
                        <div class="t t2"></div>
                        <div class="t t3"></div>
                        <div class="t t4"></div>
                        <div class="t t5"></div>
                        <div class="t t6"></div>
                        <div class="t t7"></div>
                        <div class="p">
                            <div class="pw">
                                <div class="pi">
                                    <div class="pl"></div>
                                </div>
                                <div class="pi">
                                    <div class="pr"></div>
                                </div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>

<div class="ys" style="width:50%">
    <div class="a">
        <div class="b">
            <div class="c">
                <div class="d">
                    <div class="e">
                        <div class="f"></div>
                        <div class="g"></div>
                        <div class="t t1"></div>
                        <div class="t t2"></div>
                        <div class="t t3"></div>
                        <div class="t t4"></div>
                        <div class="t t5"></div>
                        <div class="t t6"></div>
                        <div class="t t7"></div>
                        <div class="p">
                            <div class="pw">
                                <div class="pi">
                                    <div class="pl"></div>
                                </div>
                                <div class="pi">
                                    <div class="pr"></div>
                                </div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>

CSS

CSS

/* borders and background */
.ys .a {padding:1.5%;
    -moz-border-radius:100% 100% 0 0 / 166% 166% 0 0;
    -webkit-border-top-left-radius:1000em;
    -webkit-border-top-right-radius:1000em;
    border-radius:100% 100% 0 0 / 166% 166% 0 0;
    background: #b0b4b7;
    background: -moz-linear-gradient(left, #b0b4b7 8%, #3f3f40 54%);
    background: -webkit-gradient(linear, left top, right top, color-stop(8%,#b0b4b7), 
     color-stop(54%,#3f3f40));
    background: -webkit-linear-gradient(left, #b0b4b7 8%,#3f3f40 54%);
    background: -o-linear-gradient(left, #b0b4b7 8%,#3f3f40 54%);
    background: -ms-linear-gradient(left, #b0b4b7 8%,#3f3f40 54%);
    filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#b0b4b7',
     endColorstr='#3f3f40',GradientType=1);
    background: linear-gradient(to right, #b0b4b7 8%,#3f3f40 54%);
}

.ys .b {padding:5% 5% 0 5%;
    -moz-border-radius:100% 100% 0 0 / 166% 166% 0 0;
    -webkit-border-top-left-radius:1000em;
    -webkit-border-top-right-radius:1000em;
    border-radius:100% 100% 0 0 / 166% 166% 0 0;
    background: #dadadc;
    background: -moz-linear-gradient(left, #dadadc 8%, #3a3a3c 54%);
    background: -webkit-gradient(linear, left top, right top, color-stop(8%,#dadadc), 
     color-stop(54%,#3a3a3c));
    background: -webkit-linear-gradient(left, #dadadc 8%,#3a3a3c 54%);
    background: -o-linear-gradient(left, #dadadc 8%,#3a3a3c 54%);
    background: -ms-linear-gradient(left, #dadadc 8%,#3a3a3c 54%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#dadadc', 
     endColorstr='#3a3a3c',GradientType=1 );
    background: linear-gradient(to right, #dadadc 8%,#3a3a3c 54%);
}

.ys .c {padding:2.5% 2.5% 0 2.5%;
    -moz-border-radius:100% 100% 0 0 / 166% 166% 0 0;
    -webkit-border-top-left-radius:1000em;
    -webkit-border-top-right-radius:1000em;
    border-radius:100% 100% 0 0 / 166% 166% 0 0;
    background: #e1e4e5;
    background: -moz-linear-gradient(left, #e1e4e5 8%, #010204 54%);
    background: -webkit-gradient(linear, left top, right top, color-stop(8%,#e1e4e5), 
     color-stop(54%,#010204));
    background: -webkit-linear-gradient(left, #e1e4e5 8%,#010204 54%);
    background: -o-linear-gradient(left, #e1e4e5 8%,#010204 54%);
    background: -ms-linear-gradient(left, #e1e4e5 8%,#010204 54%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#e1e4e5', 
     endColorstr='#010204',GradientType=1 );
    background: linear-gradient(to right, #e1e4e5 8%,#010204 54%);
}

.ys .d {padding:2%; background-color:#0c1c48;
    -moz-border-radius:100% 100% 0 0 / 166% 166% 0 0;
    -webkit-border-top-left-radius:1000em;
    -webkit-border-top-right-radius:1000em;
    border-radius:100% 100% 0 0 / 166% 166% 0 0;
}

.ys .e {padding:58% 5% 0 5%; position:relative; overflow:hidden;
    -moz-border-radius:100% 100% 0 0 / 166% 166% 0 0;
    -webkit-border-top-left-radius:1000em;
    -webkit-border-top-right-radius:1000em;
    border-radius:100% 100% 0 0 / 166% 166% 0 0;
    background: #394d97;
    background: -moz-linear-gradient(left, #394d97 8%, #282963 54%);
    background: -webkit-gradient(linear, left top, right top, color-stop(8%,#394d97), 
     color-stop(54%,#282963));
    background: -webkit-linear-gradient(left, #394d97 8%,#282963 54%);
    background: -o-linear-gradient(left, #394d97 8%,#282963 54%);
    background: -ms-linear-gradient(left, #394d97 8%,#282963 54%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#394d97', 
     endColorstr='#282963',GradientType=1 );
    background: linear-gradient(to right, #394d97 8%,#282963 54%);
}

/* glare */
.ys .f {padding:50% 56%; position:absolute; top:11%; left:0;
    -moz-border-radius:166% 133% 0 0 / 166% 139% 0 0;
    -webkit-border-top-left-radius:166em 166em;
    -webkit-border-top-right-radius:133em 139em;
    border-radius:166% 133% 0 0 / 166% 139% 0 0;
    background: #2c3e90;
    background: -moz-linear-gradient(left, #2c3e90 8%, #120744 54%);
    background: -webkit-gradient(linear, left top, right top, color-stop(8%,#2c3e90), 
     color-stop(54%,#120744));
    background: -webkit-linear-gradient(left, #2c3e90 8%,#120744 54%);
    background: -o-linear-gradient(left, #2c3e90 8%,#120744 54%);
    background: -ms-linear-gradient(left, #2c3e90 8%,#120744 54%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#2c3e90', 
     endColorstr='#120744',GradientType=1 );
    background: linear-gradient(to right, #2c3e90 8%,#120744 54%);
}

/* base */
.ys .g {padding:50% 74%; position:absolute; bottom:-135%; left:-16%;
    -moz-border-radius:100%;
    -webkit-border-radius:1000em;
    border-radius:100%;
    background: #99c1e2;
    background: -moz-linear-gradient(top, #99c1e2 1%, #7aaed9 3%, #2f6bb0 12%);
    background: -webkit-gradient(linear, left top, left bottom, color-stop(1%,#99c1e2), 
     color-stop(3%,#7aaed9), color-stop(12%,#2f6bb0));
    background: -webkit-linear-gradient(top, #99c1e2 1%,#7aaed9 3%,#2f6bb0 12%);
    background: -o-linear-gradient(top, #99c1e2 1%,#7aaed9 3%,#2f6bb0 12%);
    background: -ms-linear-gradient(top, #99c1e2 1%,#7aaed9 3%,#2f6bb0 12%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#99c1e2', 
     endColorstr='#2f6bb0',GradientType=0 );
    background: linear-gradient(to bottom, #99c1e2 1%,#7aaed9 3%,#2f6bb0 12%);
}

/* ticks */
.ys .t {width:14%; height:6%; background-color:#e7e8e9; position:absolute;
    -moz-border-radius:30% / 100%;
    -webkit-border-radius:1000em;
    border-radius:30% / 100%;
}
.ys .t1 {left:7%; bottom:18%;}
.ys .t2 {left:11%; bottom:47%;
    -webkit-transform:rotate(30deg);
    -moz-transform:rotate(30deg);
    -o-transform:rotate(30deg);
    -ms-transform:rotate(30deg);
    transform:rotate(30deg);
}
.ys .t3 {left:24%; bottom:70%;
    -webkit-transform:rotate(60deg);
    -moz-transform:rotate(60deg);
    -o-transform:rotate(60deg);
    -ms-transform:rotate(60deg);
    transform:rotate(60deg);
}
.ys .t4 {left:44%; top:16%;
    -webkit-transform:rotate(90deg);
    -moz-transform:rotate(90deg);
    -o-transform:rotate(90deg);
    -ms-transform:rotate(90deg);
    transform:rotate(90deg);
}
.ys .t5 {right:24%; bottom:70%;
    -webkit-transform:rotate(-60deg);
    -moz-transform:rotate(-60deg);
    -o-transform:rotate(-60deg);
    -ms-transform:rotate(-60deg);
    transform:rotate(-60deg);
}
.ys .t6 {right:11%; bottom:47%;
    -webkit-transform:rotate(-30deg);
    -moz-transform:rotate(-30deg);
    -o-transform:rotate(-30deg);
    -ms-transform:rotate(-30deg);
    transform:rotate(-30deg);
}
.ys .t7 {right:7%; bottom:18%;}

/* pointer by @thierrykoblentz */
.ys .p {padding-bottom:52%; width:11%; position:absolute; left:50%; bottom:20%; 
 margin-left:-5%;
    -webkit-transform:rotate(20deg);
    -moz-transform:rotate(20deg);
    -o-transform:rotate(20deg);
    -ms-transform:rotate(20deg);
    transform:rotate(20deg);
    -webkit-transform-origin:bottom;
    transform-origin:bottom;
    -webkit-transition:all 200ms cubic-bezier(0.200, 0.000, 1.000, 0.360);
    transition:all 200ms cubic-bezier(0.200, 0.000, 1.000, 0.360);
}
.ys:hover .p {
    -webkit-transform:rotate(90deg);
    -moz-transform:rotate(90deg);
    -o-transform:rotate(90deg);
    -ms-transform:rotate(90deg);
    transform:rotate(90deg);
}
.ys .pw {position:absolute; top:0; right:0; bottom:0; left:0;}
.ys .pw > :first-child {border-right:1px solid transparent; margin-right:-2px;}
.ys .p::after {content:""; position:absolute; width:97%; padding-bottom:92%; top:88%; 
 z-index:1;
    -moz-border-radius:100%;
    -webkit-border-radius:1000em;
    border-radius:100%;
    background: #ef4d58;
    background: -moz-linear-gradient(left, #ef4d58 10%, #ce1f2b 20%);
    background: -webkit-gradient(linear, left top, right top, color-stop(10%,#ef4d58), 
     color-stop(20%,#ce1f2b));
    background: -webkit-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    background: -o-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    background: -ms-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ef4d58', 
     endColorstr='#ce1f2b',GradientType=1);
    background: linear-gradient(to right, #ef4d58 10%,#ce1f2b 20%);
}
.ys .pi {width:50%; height:100%; overflow:hidden; position:relative; float:left;}
.ys .pl, .ys .pr {position:absolute; width:200%; height:120%; left:50%;
    -webkit-transform:rotate(10deg);
    -moz-transform:rotate(10deg);
    -o-transform:rotate(10deg);
    -ms-transform:rotate(10deg);
    transform:rotate(10deg);
    background: #ef4d58;
    background: -moz-linear-gradient(left, #ef4d58 10%, #ce1f2b 20%);
    background: -webkit-gradient(linear, left top, right top, color-stop(10%,#ef4d58), 
     color-stop(20%,#ce1f2b));
    background: -webkit-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    background: -o-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    background: -ms-linear-gradient(left, #ef4d58 10%,#ce1f2b 20%);
    filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#ef4d58', 
     endColorstr='#ce1f2b',GradientType=1 );
    background: linear-gradient(to right, #ef4d58 10%,#ce1f2b 20%);
}
.ys .pr {right:50%; left:auto;
    -webkit-transform:rotate(-10deg);
    -moz-transform:rotate(-10deg);
    -o-transform:rotate(-10deg);
    -ms-transform:rotate(-10deg);
    transform:rotate(-10deg);
}

第 12 章 Android 中无用的背景图像下载

Chapter 12. Useless Downloads of Background Images in Android

埃里克 ·达斯佩特

Éric Daspet

让我们先快速提醒一下。在 CSS 中,“C”代表“级联”。您可以为元素属性指定许多相互冲突的规则:根据不同的权重和优先级,仅应用一个规则。

Let’s begin with a quick reminder. In CSS, the “C” stands for “cascading.” You may specify many conflicting rules for an element property: only one will be applied, based on different weights and priorities.

p { background-image: url(red.png) }
p { background-image: url(green.png) }
p.intro { background-image: url(yellow.png) }

使用前面的代码和 a <p class=intro>,您的段落应该以黄色背景显示。浏览器很聪明。如果您没有任何其他<p>标签,他们只会下载黄色图像,即使您这样做,也永远不会下载红色图像。

With the previous code and a <p class=intro>, your paragraph should be displayed with a yellow background. Browsers are smart. If you don’t have any other <p> tag, they will only download the yellow image and even if you do, the red image will never be downloaded.

安卓问题

The Android Problem

嗯……这就是它应该如何工作的。WebKit 在 2010 年底修复了一个旧错误 ( https://bugs.webkit.org/show_bug.cgi?id=24223 ),导致它下载所有三个图像。在复杂的网站中,这可能是一个主要的性能故障。

Well… that’s how it should work. WebKit had an old bug, fixed in late 2010 (https://bugs.webkit.org/show_bug.cgi?id=24223), that made it download all three images. In a complex website, this could be a major performance glitch.

我为什么要挖一个老bug?Chrome、Safari 和其他基于 webkit 的浏览器现在可能已经是最新的了,但我们的问题仍然存在于移动世界:Android。Android 2.x 设备中提供的几乎所有默认浏览器仍然受到此性能问题的影响。

Why am I digging up an old bug? Chrome, Safari, and other WebKit-based browsers are probably up to date by now, but our problem still lives in the mobile world: Android. Almost every default browser shipped on Android 2.x devices is still affected by this performance issue.

移动世界高度分散,更新也没有定期安排。看看Android智能手机,大多数设备仍然运行Android 2.2或Android 2.3。某些设备(例如 Nexus S)可能会在 2012 年第一季度更新至 Android 4.0。但遗憾的是,大多数设备都不会更新。多年来您仍然会找到 Android 2.2 和 2.3 设备。例如,在法国,三星 Galaxy S 取得了真正的成功,但它将运行 Android 2.3,并且仍将使用至少一年,也许两年。

The mobile world is highly fragmented and updates are not regularly scheduled. Looking at Android smartphones, the majority of devices are still running Android 2.2 or Android 2.3. Some devices, like the Nexus S, will probably be updated to Android 4.0 in the first quarter of 2012. Sadly, most of them won’t. You will still find Android 2.2 and 2.3 devices for years. For example, here in France, the Samsung Galaxy S was a true success, but it will be running Android 2.3 and will still be in use for at least one year, maybe two.

如果您的目标是移动受众,那么您现在就知道了您的性能敌人之一。如果你不……好吧,看来你有更大的问题需要处理。

If you target a mobile audience, you now know one of your performance enemies. If you don’t… well, it seems that you have bigger problems to deal with.

以及缺乏解决方案

And the Lack of a Solution

您可能希望这篇笔记有一个圆满的结局,并有一个解决方案,或者至少是一些解决方法。你的期望是对的,但我无能为力。

You probably expect a happy ending to this note with a solution, or at least some workaround. You are right to expect this, but I won’t be able to help.

据我所知,没有解决方法,所以这里有两个指导方针:

As far as I know, there is no workaround, so here are two guidelines:

  • 仅将 CSS 中的背景图像添加到#id选择器。

  • Add background images in your CSS only to #id selectors.

  • 避免使用带有可能针对同一元素的背景图像的多个选择器(这意味着没有级联的样式表)。

  • Avoid using multiple selectors with background images that may target the same element (which means a style sheet without cascade).

我知道,这些指导方针不可能毫无例外地遵循。这里的目的不是删除所有无用的下载,而是通过“尽力而为”的规则来减少它们,以帮助您的用户体验。至少,尽量避免对跨越整个网页的大背景图像使用级联。

I know, these guidelines are impossible to follow without exceptions. The purpose here is not to remove all useless downloads, but to reduce them with a “best effort” rule, in order to help your user experience. At the very least, try to avoid using the cascade for large background images that span the entire web page.
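
To find the places where the second guideline is violated, it can help to list every background image that competes for a given element. The following is only a minimal sketch: the `rules` array is a simplified, hypothetical stand-in for a parsed style sheet, and `competingBackgrounds` and `introMatches` are names made up for this illustration. In a browser, the `matches` callback could wrap the element's own selector matching.

```javascript
// Sketch: list every background image whose rule applies to one element.
// `rules` is a simplified, hypothetical representation of a style sheet.
// `matches` decides whether the element matches a selector; in a browser
// it could be: function (sel) { return el.matches(sel); }
function competingBackgrounds(rules, matches) {
  var hits = [];
  for (var i = 0; i < rules.length; i++) {
    if (matches(rules[i].selector)) {
      hits.push(rules[i].image);
    }
  }
  return hits;
}

// The three rules from the beginning of the chapter.
var rules = [
  { selector: "p", image: "red.png" },
  { selector: "p", image: "green.png" },
  { selector: "p.intro", image: "yellow.png" }
];

// Hand-written matcher standing in for a <p class="intro"> element.
var introMatches = function (sel) {
  return sel === "p" || sel === "p.intro";
};

var candidates = competingBackgrounds(rules, introMatches);

// More than one candidate means an affected browser may download them all.
console.log(candidates.length > 1
  ? "overlap: " + candidates.join(", ")
  : "no overlap");
```

Any element that reports more than one candidate is a place where an affected Android browser may fetch images that are never displayed.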

笔记

Note

要评论本章,请访问http://calendar.perfplanet.com/2011/useless-downloads-of-background-images-in-android/。最初发布于 2011 年 12 月 12 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/useless-downloads-of-background-images-in-android/. Originally published on Dec 12, 2011.

第 13 章 为网络计时

Chapter 13. Timing the Web

阿洛伊斯 ·赖特鲍尔

Alois Reitbauer

使用 YSlow、SpeedTracer 或 dynaTrace Ajax Edition 等浏览器插件来分析网页的加载行为变得非常容易。然而,一旦我们离开浏览器,故事就不同了。从真实用户那里获取详细数据要困难得多,而且只能达到一定的粒度。通常的方法是使用综合监控并从尽可能靠近最终用户的多个存在点执行测试。如果您从多个位置进行测量并涵盖大部分交易,那么这非常接近用户的感知性能。如果您对使用综合监控的优点和缺点的更多详细信息感兴趣,请推荐这篇博客文章(http://blog.dynatrace.com/2011/10/06/is-synthetic-monitoring-really-going-to-die/)。

Analyzing the loading behavior of web pages with browser plug-ins like YSlow, SpeedTracer, or dynaTrace Ajax Edition has become really easy. As soon as we leave the browser, however, the story is a different one. Getting detailed data from real users is much harder and only possible to a certain level of granularity. The usual approach is to use synthetic monitoring and execute tests from a variety of points of presence as close to end users as possible. If you measure from many locations and cover most of your transactions, this comes pretty close to the users’ perceived performance. In case you are interested in more details on the pros and cons of synthetic monitoring, I recommend this blog post (http://blog.dynatrace.com/2011/10/06/is-synthetic-monitoring-really-going-to-die/).

然而,从用户的角度了解性能的最佳方法是在实际的浏览器中进行测量。虽然这听起来很简单,但事实证明这是一个挑战。仅使用浏览器中可用的信息来创建如图 13-1所示的瀑布图是不可能的。

The best way to understand performance from a user’s perspective, however, is to measure in the actual browser. While this sounds very simple, it turns out to be rather a challenge. Creating a waterfall chart like the one in Figure 13-1 using only the information available in the browser is simply impossible.

尽管有像 Boomerang ( https://github.com/yahoo/boomerang ) 这样的免费库和商业产品可以提供一些此类信息,但这往往相当困难。实际上,出现的第一个问题是最难回答的:加载页面需要多长时间。让我们在这里说得更准确一些。从用户通过单击链接或键入 URL 启动页面加载到页面完全加载需要多长时间。这(有一些不准确)对于后续页面仍然可行,但对于起始页面则不可能。然而现在已经可以使用一小部分 JavaScript,如示例 13-1所示,它将计算从页面开始到加载的时间。虽然这提供了有关加载时间的提示,但我们没有看到 DNS 查找、连接的建立或重定向。因此,这些值可能反映也可能不反映用户感知的加载时间。

Although there are free libraries like Boomerang (https://github.com/yahoo/boomerang) and commercial products that can provide some of this information, getting it tends to be pretty tough. Actually, the first question that comes up is one of the hardest to answer: How long does it take to load a page? Let’s be more precise here: how long does it take from the time a user initiates the loading of a page, by clicking a link or typing a URL, until the page is fully loaded? This is still doable, with some inaccuracies, for subsequent pages, but impossible for start pages. What is already possible today is to use a small piece of JavaScript, as shown in Example 13-1, which calculates the time from the beginning of the page until it is loaded. While this provides a hint about loading times, we do not see DNS lookups, the establishment of connections, or redirects. So these values might or might not reflect the load time perceived by the user.

例13-1。用于测量页面加载时间的简单脚本

Example 13-1. Simple script for measuring page load time

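<html>
  <head>
  <script>
    // Record a timestamp as early as possible while the page is parsed.
    var start = new Date().getTime();
    // Called from the body's onload attribute once the page has loaded.
    function onLoad() {
      var now = new Date().getTime();
      var latency = now - start;
      alert("page loading time: " + latency);
    }
  </script>
  </head>
  <body onload="onLoad()">
  ...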
显示浏览器中客户端活动的瀑布图

图 13-1。显示浏览器中客户端活动的瀑布图

Figure 13-1. Waterfall chart showing client activity in the browser

如果我们现在更进一步,还想对页面上的资源(如图像、CSS 或 JavaScript 文件)进行计时,那就会变得更加困难。我们可以使用示例 13-2中的代码片段来获取资源计时。对页面加载时间以及对此行为进行编码的工作量的影响是显着的。

If we now go even further and also want to time resources on the page, like images, CSS, or JavaScript files, it gets even harder. We could use a code snippet like the one in Example 13-2 to get resource timings. Both the impact on page load time and the effort of coding this behavior are significant.

例13-2。对负载行为有重大影响的时间资源简单方法

Example 13-2. Simple approach to time resources with significant impact on load behaviour

...
<script>
  downloadStart("myimg");
</script>
<img src="./myimg.jpg" onload="downloadEnd('myimg')" />
...
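
Example 13-2 calls two helpers, downloadStart and downloadEnd, that are not defined in the listing. One way they might look is sketched below. Only the two function names come from the example; the `resourceTimings` object and the shape of its entries are assumptions made for this illustration.

```javascript
// Hypothetical helpers for Example 13-2. The function names come from the
// example; the implementation and the resourceTimings store are assumptions.
var resourceTimings = {};

// Remember when the inline script just before the resource was executed.
function downloadStart(name) {
  resourceTimings[name] = { start: new Date().getTime(), end: null, duration: null };
}

// Called from the resource's onload handler; computes the elapsed time.
function downloadEnd(name) {
  var entry = resourceTimings[name];
  if (!entry) {
    return; // downloadStart was never called for this name
  }
  entry.end = new Date().getTime();
  entry.duration = entry.end - entry.start;
}

// Usage outside a page: simulate one resource.
downloadStart("myimg");
downloadEnd("myimg");
console.log("myimg took " + resourceTimings.myimg.duration + " ms");
```

Note that the start time is taken when the inline script runs, not when the browser actually issues the request, so the measured duration is only an approximation.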

因此,从最终用户的角度获取性能信息确实很困难。然而,浏览器拥有所有这些信息。对于浏览器来说,公开它以便 JavaScript 可以轻松访问它不是最自然的事情吗?这正是 W3C Web 性能工作组 ( http://www.w3.org/2010/webperf/ ) 正在致力于的工作。该小组正在制定一套标准,使开发人员能够访问这些数据。使用示例 13-3中的一小段 JavaScript,我们可以轻松找出加载页面所需的时间。

So it is really hard to get performance information from an end user’s perspective. However, browsers have all this information. Wouldn’t the most natural thing be for the browser to expose it so that JavaScript can access it easily? This is what the W3C Web Performance Working Group (http://www.w3.org/2010/webperf/) is working on. The group is developing a set of standards that give developers access to this data. Using the short piece of JavaScript in Example 13-3, we can easily find out how long it took to load a page.

例13-3。使用导航计时来测量页面加载时间

Example 13-3. Using Navigation Timing to measure page load time

<html>
<head>
<script>
function onLoad() {
  var now = new Date().getTime();
  var page_load_time = now - performance.timing.navigationStart;
  alert("User-perceived page loading time: " + page_load_time);
}

</script>
</head>
<body onload="onLoad()">
...

我们可以获得有关页面加载的更多详细信息,以了解页面加载过程的每个“阶段”花费了多长时间。如图13-2所示,我们可以了解解析主机名、建立连接、发送请求、等待响应花费了多长时间,或者执行处理程序花费了多长时间onLoad

We can get even more details on the loading of a page to understand how long each “phase” of the page-loading process took. As shown in Figure 13-2, we can find out how long it took to resolve the host name, establish a connection, send the request, and wait for the response, or how long it took to execute onLoad handlers.

导航计时提供的详细计时

图 13-2。导航计时提供的详细计时

Figure 13-2. Detailed timings provided by Navigation Timing
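
The phases in Figure 13-2 correspond to pairs of timestamps on the performance.timing object. A minimal sketch of that arithmetic follows. The function name phaseDurations and the keys of the object it returns are illustrative choices, not part of the specification, and the timing object is passed in as a parameter so that the function can also be exercised outside a browser.

```javascript
// Sketch: turn Navigation Timing timestamps into per-phase durations (ms).
// `phaseDurations` and its result keys are hypothetical names for illustration.
function phaseDurations(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart, // resolve the host name
    connect: t.connectEnd - t.connectStart,       // establish the connection
    request: t.responseStart - t.requestStart,    // send the request, wait for the first byte
    response: t.responseEnd - t.responseStart,    // receive the response
    onLoad: t.loadEventEnd - t.loadEventStart     // run the onLoad handlers
  };
}

// Browser-only usage. loadEventEnd is still 0 while the load handlers run,
// so the values are read from a timer that fires after they have finished.
if (typeof window !== "undefined" && window.performance && window.performance.timing) {
  window.addEventListener("load", function () {
    setTimeout(function () {
      console.log(phaseDurations(window.performance.timing));
    }, 0);
  });
}
```

Keeping the arithmetic in a pure function means the same code can be checked against recorded timestamps before it is deployed to real pages.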

此功能称为“导航计时”( http://w3c-test.org/webperf/specs/NavigationTiming/ ),已在最新的浏览器版本中实现。在移动设备上,Windows Mango 上的 IE9 也已经公开了此信息(图 13-3)。

This functionality, called Navigation Timing (http://w3c-test.org/webperf/specs/NavigationTiming/), is already implemented in latest browser versions. On mobile, IE9 on Windows Mango already exposes this information as well (Figure 13-3).

在桌面和移动浏览器中使用导航计时

图 13-3。在桌面和移动浏览器中使用导航计时

Figure 13-3. Using Navigation Timing in desktop and mobile browsers

尽管这是向前迈出的一大步,但我们仍然缺乏有关页面加载行为的大量详细信息。最重要的是,我们缺少有关下载资源的详细信息。从响应开始到 onLoad 事件之间发生的所有事情仍然是一个黑匣子。

Although this is a great step forward, we still lack a significant amount of details about page loading behavior. Most importantly, we miss details about downloaded resources. Everything that happens between the start of the response and the onLoad event stays a black box.

因此,资源计时 ( http://w3c-test.org/webperf/specs/ResourceTiming/ ) 规范定义了一个接口来访问有关资源的详细网络信息。就像初始页面一样,我们获得与主文档相同的信息粒度(图 13-4)。

Therefore the Resource Timing (http://w3c-test.org/webperf/specs/ResourceTiming/) specification defines an interface to access detailed networking information about resources. Just as with the initial page, we get the same granularity of information as for the main document (Figure 13-4).

资源计时提供的计时

图 13-4。资源计时提供的计时

Figure 13-4. Timings provided by resource timings
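
To make the idea concrete, here is a small sketch of how per-resource data could be consumed. It assumes the interface in the form in which it was later standardized, where performance.getEntriesByType("resource") returns entries carrying a name (the URL) and a duration in milliseconds; the draft available at the time of writing may differ. The helper name slowestResources is hypothetical.

```javascript
// Sketch: list the slowest resources from Resource Timing entries.
// `slowestResources` is a hypothetical helper. Entries are assumed to carry
// `name` (the URL) and `duration` (ms), as PerformanceResourceTiming entries do.
function slowestResources(entries, count) {
  return entries
    .slice() // copy, so the caller's array is not reordered
    .sort(function (a, b) { return b.duration - a.duration; })
    .slice(0, count)
    .map(function (e) { return { url: e.name, ms: Math.round(e.duration) }; });
}

// Browser-only usage, guarded so the sketch also loads where the API is missing.
if (typeof window !== "undefined" && window.performance &&
    typeof window.performance.getEntriesByType === "function") {
  console.log(slowestResources(window.performance.getEntriesByType("resource"), 5));
}
```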

不幸的是,该规范尚未在当前浏览器中实现,但希望在明年年中之前在未来的浏览器版本中可用。我认为至少对于所有已经实现导航计时的浏览器来说都是如此。

Unfortunately this spec is not yet implemented in current browsers but hopefully will be available with future browser versions by mid next year. I think this is true at least for all the browsers that already implement Navigation Timing.

因此,这使我们能够深入了解应用程序的网络行为;然而,我们仍然缺少的是在页面上对自定义事件进行计时的能力。让我们看一个简单的例子。假设我们想要测量某些内容何时在页面上可见。这就是用户计时规范 ( http://w3c-test.org/webperf/specs/UserTiming/ ) 发挥作用的地方。用户计时允许我们测量离散的时间点,例如从导航开始到在页面上显示某些内容所花费的时间。例 13-4中的代码片段展示了这段代码的样子。

So this gives us great insight into the networking behavior of the application; what we still miss, however, is the ability to time custom events on a page. Let’s look at a simple example. Assume we want to measure when certain content is visible on the page. This is where the User Timing specification (http://w3c-test.org/webperf/specs/UserTiming/) comes into play. User Timing allows us to measure discrete points in time, like how long it took from navigation start to the displaying of certain content on a page. The snippet in Example 13-4 shows what this code might look like.

例13-4。使用用户计时测量页面加载中的自定义点

Example 13-4. Measuring a custom point in page load using User Timing

var perf = window.performance;
perf.measure("customLoad");
var customLoadTime = perf.getMeasures("customLoad")[0];
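
Example 13-4 follows the draft of the specification that was current when this chapter was written. For comparison, the sketch below uses the interface as it was later standardized, in which measures are read back with getEntriesByName() rather than getMeasures(). The helper name measureFromStart is hypothetical, and the performance object is passed in so that the function can be tried with a stand-in object.

```javascript
// Sketch: record a custom measurement and read it back.
// `measureFromStart` is a hypothetical helper. `perf` is any object offering
// measure() and getEntriesByName(), such as window.performance in browsers
// that implement the standardized User Timing interface.
function measureFromStart(perf, name) {
  perf.measure(name); // no start mark given: measured from the start of navigation
  var entries = perf.getEntriesByName(name, "measure");
  return entries.length ? entries[entries.length - 1].duration : null;
}

// Browser-only usage, guarded so the sketch also loads where the API is missing.
if (typeof window !== "undefined" && window.performance &&
    typeof window.performance.measure === "function") {
  console.log("customLoad: " + measureFromStart(window.performance, "customLoad") + " ms");
}
```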

因此,将所有这些放在一起,我们就有了一个很好的方法来记录页面上发生的所有重大事件的时间。因为使用所有这些不同的 API 最终可能会有点混乱,所以还会有一个通用接口来访问所有这些数据。这就是性能时间线 ( http://w3c-test.org/webperf/specs/PerformanceTimeline/ ) 的内容。时间线提供了一个统一的接口来访问所有与性能相关的信息。

So putting all this together, we have a good way to time all major events that happen on a page. Because using all these different APIs might end up being a bit confusing, there will also be a common interface to access all this data. That’s what the Performance Timeline (http://w3c-test.org/webperf/specs/PerformanceTimeline/) is about. The timeline provides a unified interface to access all performance-related information.

结论

Conclusion

虽然尚未完全实施,但用于计时网页的新 W3C 规范提供了一种直接在用户浏览器中访问性能信息的简单方法。在未来的浏览器版本中,我们将能够删除目前用于获取最终用户计时信息的许多神奇代码。

While they are not fully implemented yet, the new W3C specifications for timing web pages provide an easy way to access performance information right in the user’s browsers. In future browser versions we will be able to drop a lot of the magic code used today to get end user timing information.

然而,一个尚未解答的问题是如何将这些数据发送回服务器。目前有两种可能的方法。我们可以使用信标(搭载监控数据的 HTTP GET 请求)或 XHR。在大多数情况下,这两种方法的效果都可以接受,不过在 onBeforeUnload 事件中发送数据存在一些问题。因此,如果我们将所有内容放在一起并添加服务器端基础设施,这就是我们可以收集的有关最终用户的数据。

One question that remains unanswered, however, is how this data is sent back to the server. Currently there are two possible approaches. We can use beacons (HTTP GET requests that piggyback the monitoring data) or XHRs. Both approaches work acceptably well in most cases, although there are some issues with sending data in the onBeforeUnload event. So if we put everything together and add server-side infrastructure this is the data we can collect about our end users.
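
The beacon approach can be sketched in a few lines: the measurements are encoded onto the query string of a GET request, and the request is triggered by assigning the URL to an image. The endpoint URL, the parameter names, and the helper names buildBeaconUrl and fireBeacon are all assumptions made for illustration.

```javascript
// Sketch: encode timing data onto a beacon URL.
// The endpoint and the parameter names are hypothetical.
function buildBeaconUrl(endpoint, data) {
  var pairs = Object.keys(data).map(function (key) {
    return encodeURIComponent(key) + "=" + encodeURIComponent(String(data[key]));
  });
  return endpoint + (endpoint.indexOf("?") === -1 ? "?" : "&") + pairs.join("&");
}

// Fire-and-forget GET request: the response is ignored.
// Does nothing outside a browser, where Image is not defined.
function fireBeacon(url) {
  if (typeof Image !== "undefined") {
    new Image().src = url;
  }
}
```

A call such as fireBeacon(buildBeaconUrl("http://example.com/beacon", { load: 1234 })) would then deliver the value to whatever server-side collector listens at that address.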

作为最后的预览,我可以向您展示使用现代技术我们将获得什么级别的粒度。图 13-5中的信息是通过我们自己的监控收集的,使用一种将导航和资源计时“向后移植”到现有浏览器中。

As a final sneak peek, I can show you what level of granularity we will get using modern technology. The information on Figure 13-5 is collected by our own monitoring using a kind of “backport” of Navigation and Resource Timing into existing browsers.

显示缓慢第三方的博客页面的基于最终用户的性能数据

图 13-5。显示缓慢第三方的博客页面的基于最终用户的性能数据

Figure 13-5. End-user-based performance data for a blog page showing slow third parties

如果您今天想尝试新的 API,只需点击此链接 ( http://blog.dynatrace.com/samples/bookmark.html ) 并检查加载此页面需要多长时间。您可以使用这个简单的书签来获取您感兴趣的任何页面的计时信息。

If you want to try the new APIs today, just follow this link (http://blog.dynatrace.com/samples/bookmark.html) and check how long it took to load this page. You can use this simple bookmarklet to get timing information for any page you are interested in.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/timing-the-web/。最初发布于 2011 年 12 月 13 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/timing-the-web/. Originally published on Dec 13, 2011.

第 14 章我看到 HTTP

Chapter 14. I See HTTP

斯托扬 ·斯特凡诺夫

Stoyan Stefanov

女士们先生们,男孩们女孩们。向 icy 打个招呼。

Ladies and gentlemen, boys and girls. Say hello to icy.

冰冷的

icy

它是一个 iOS 应用程序,可让您调试 HTTP。它类似于 HTTPWatch ( http://httpwatch.com/ ) 或 WebPagetest ( http://webpagetest.org/ ),但适用于移动设备。与 blaze.io 的 mobitest ( http://www.blaze.io/mobile/ ) 类似,但在你的口袋里,它可以与 3G、Edge 配合使用(因为这些可以具有与 WiFi 不同的特性和运营商优化),并且还可以让你检查登录后的页面。

It’s an iOS app that lets you debug HTTP. It’s like HTTPWatch (http://httpwatch.com/) or WebPagetest (http://webpagetest.org/), but for mobile. It’s also like blaze.io’s mobitest (http://www.blaze.io/mobile/), but in your pocket: it works over 3G and Edge (which can have different characteristics and carrier optimizations than WiFi), and it lets you inspect pages behind a login.

一些细节

Some details

  • 它是一个 UIWebView,加载您想要的页面,并提供一个 NSURLCache 类,该类记录 iOS 网络层向其抛出的任何内容。

  • It’s a UIWebView that loads the page you want and provides an NSURLCache class, which logs whatever the iOS networking layer throws at it.

  • 它位于 github 上(https://github.com/stoyan/icy)。请注意,这是我第一次尝试 iOS 和 Obj-C,因此代码质量可能很糟糕。许可证是公共领域的,因为我不太了解其他的。

  • It’s on github (https://github.com/stoyan/icy). Note that this is my very first attempt at iOS and Obj-C so the code quality is probably atrocious. License is public domain, because I don’t really understand the others.

  • 名称是icy,因为它是 iOS,并且法律规定应用程序名称必须以“i”为前缀。另外(至少对我东欧人的耳朵来说),“icy”听起来像“I see”(在聊天中拼写为“ic”),并且是(用最诡异的声音说)“I see... HTTPeee”的开头。

  • The name is icy, because it’s iOS and it’s the law that app names be prefixed with an “i”. Also (to my Eastern European ear at least), “icy” sounds like “I see” (spelled “ic” in chats) and is the beginning of (said with spookiest of voices) “I see… HTTPeee.”

演练

Walkthrough

千里之行,始于轻轻一按。如图14-1所示,该图标是默认/缺失的图标。(谁在乎图标?)如果你足够集中注意力,你可能会说服自己白色图标实际上是有意义的,它就像雪,或者,你有它,冰。

A journey of a thousand miles begins with a single tap. As you can see in Figure 14-1, the icon is the default/missing icon. (Who cares about icons?) If you focus hard enough you may convince yourself that the white icon actually makes sense: it’s like snow, or, there you have it, ice.

应用程序图标

图 14-1。应用程序图标

Figure 14-1. App icon

然后我们就得到了(图14-2)等待UIWebView加载的页面和地址栏。就在那里,您已经看到了该应用程序的第一个问题 —UIWebView并不是真正的 iOS Safari。它的行为可能不同,甚至有不同的 JavaScript 引擎。但这已经是我们所能得到的最接近的了。

What we have then (Figure 14-2) is a UIWebView waiting to load a page and an address bar. Right there you already see the first problem with the app—UIWebView is not really iOS Safari. It may act differently and even have a different JavaScript engine. But it’s as close as we can get.

“浏览器”

图 14-2。“浏览器”

Figure 14-2. The “browser”

敲击、打字、敲击、打字……(见图14-3。)

Tapping, typing, tapping, typing… (See Figure 14-3.)

导航到页面

图 14-3。导航到页面

Figure 14-3. Navigating to a page

哦,看,页面已加载!现在,让我们揭开面纱,一睹这些奇思妙想的背后到底是什么(图 14-4)。

Oh look, a page is loaded! Now let’s remove the veil and peek to see what’s underneath all that fanciness (Figure 14-4).

页面已加载,等待检查

图 14-4。页面已加载,等待检查

Figure 14-4. Page loaded, waiting to be inspected

哈!请求!(见图14-5。)

Ha! Requests! (See Figure 14-5.)

页面组件列表

图 14-5。页面组件列表

Figure 14-5. List of page components

正如你所看到的,我从 webkit 项目中窃取了 JS/CSS/HTML 图标。如果页面组件看起来像图像(有Content-Type: image/*),您会看到一个小缩略图。

As you can see, I stole the JS/CSS/HTML icons from the webkit project. And if a page component looks like an image (has Content-Type: image/*), you see a little thumbnail.

您会看到此页面发出的请求数。

You see the number of requests that this page made.

此外,每个请求行都是一个指向更多详细信息的链接(图 14-6)。

Also each request line is a link to more details (Figure 14-6).

详细信息分为“元”、“请求标头”和“响应标头”。Meta包含 URL 和持续时间等一般信息。

The details are split into “Meta,” “Request headers,” and “Response Headers.” Meta contains general information such as URL and duration.

组件详细信息视图

图 14-6。组件详细信息视图

Figure 14-6. Component details view

“但是持续时间准确吗?” 您可能会以挑剔的读者和性能极客的身份提出这样的问题。据我所知,这是相当准确的。

“But is the duration accurate?” you may ask as a critical reader and a performance geek. To the best of my knowledge it’s pretty accurate.

图 14-7显示了我们所了解和喜爱的请求标头。

Figure 14-7 shows request headers, as we know and love them.

请求标头

图 14-7。请求标头

Figure 14-7. Request headers

如果文本被截断,您可以再次点击,获取标题值的全文(图14-8)。

If the text is cut off, you can tap again and get the full text of the header value (Figure 14-8).

标题的全文

图 14-8。标题的全文

Figure 14-8. Full text of a header

在请求/响应标头之后,我们可以预览组件的外观。如果是图像,您会得到一个小缩略图,单击该缩略图可以获得更大的图像(图 14-9、图 14-10)。

After request/response headers, what we have is a preview of what the component looks like. If it’s an image, you get a little thumbnail that you can click to get a bigger image (Figure 14-9, Figure 14-10).

组件预览(图像)

图 14-9。组件预览(图像)

Figure 14-9. Component preview (images)

组件全视图(图像)

图 14-10。组件全视图(图像)

Figure 14-10. Component full view (images)

如果组件是文本,您将获得前几个字符,然后点击以获取完整内容(图14-11、图14-12)。

If the component is text, you get the first few characters and then tap for the real deal (Figure 14-11, Figure 14-12).

组件预览(文本组件,例如CSS、JS)

图 14-11。组件预览(文本组件,例如CSS、JS)

Figure 14-11. Component preview (text components, e.g. CSS, JS)

组件全视图(文本组件)

图 14-12。组件全视图(文本组件)

Figure 14-12. Component full view (text components)

这就是目前的全部内容。

And that’s all there is for now.

待办事项

Todos

有一些立即要做的事情(我很乐意接受任何帮助)和一些更一般的前进想法。

There are a few immediate todos (for which I’d gladly take any help) and some more general ideas for going forward.

首先,NSURLCache(http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSURLCache_Class/Reference/Reference.html)是检查网络的最佳/唯一方法吗?起初我对 iOS SDK 没有提供检查流量的 API 感到有点失望。但后来我看到 Patrick Meenan 需要做些什么才能使 WebPagetest 发生(http://calendar.perfplanet.com/2011/webpagetest-internals/),所以我想一点黑客手段和方法调配(method swizzling,http://www.cocoadev.com/index.pl?MethodSwizzling)可能比较合适。这可能会降低应用程序进入应用程序商店的机会。

First of all, is the NSURLCache (http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSURLCache_Class/Reference/Reference.html) the best/only way to inspect the network? At first I was a little disappointed that the iOS SDK doesn’t provide APIs to inspect the traffic. But then I saw what Patrick Meenan needs to do to make WebPagetest happen (http://calendar.perfplanet.com/2011/webpagetest-internals/), so I guess a little hacking and method swizzling (http://www.cocoadev.com/index.pl?MethodSwizzling) might be appropriate. Which might lower the chances of the app ever hitting the app store.

无论如何,NSURLCache 是在本机/混合应用程序中实现自己的缓存的一种方法。这本身就是构建 iOS 应用程序时值得了解的一个很好的优化。您创建一个扩展 NSURLCache 的类并宣布它:

Anyway, NSURLCache is a way to implement your own caching in your native/hybrid app. Which in and of itself is a nice optimization to know about when building iOS apps. You create a class extending NSURLCache and announce it:

[NSURLCache setSharedURLCache:mycache];

然后每次网络视图要发出请求时,它都会问你的类“嘿,有 google.com/logo.png 这个东西吗?”而且每次下载组件时,它都会传递给您的类,以便您可以存储它。

And then every time the web view is about to make a request, it will ask your class “hey, got that google.com/logo.png thing?” And also every time a component is downloaded, it will be passed to your class so you can store it.

这就是icy应用程序的构建方式,只是我不存储和返回文件,而是记录任何发生的事情。

And this is how the icy app was built, only instead of storing and returning files, I just log anything that comes my way.

而这种“任何发生在我身上的事情”就是内省不完整的地方。有时,网络层不会调用我的方法来表示新的响应已经到达。被认为不可缓存的响应可能永远不会到达我的 NSURLCache 子类。在这些情况下,您会在应用程序中看到我收到了请求,但没有匹配的响应。在图 14-13的示例中,它是 Facebook 的 Like 按钮的 PHP。白色图标意味着我没有得到可供检查的 Content-Type 响应标头。

And this “anything that comes my way” is where incompleteness of introspection comes in. Sometimes the networking layer doesn’t call my method to say that a new response has arrived. Responses that are thought of as uncacheable may never reach my NSURLCache child. In these cases, you see in the app that I got the request, but no response for it to match. In the example in Figure 14-13 it’s the PHP for Facebook’s Like button. The white icon means I didn’t get a Content-Type response header to inspect.

缺少响应信息

图 14-13。缺少响应信息

Figure 14-13. Missing response information

这就是为什么我认为重新获取对于检查我们没有得到响应的 URL 可能是一个好主意。我们可以单独发出一个有意的请求并获得响应,这样就不依赖于 NSURLCache 和 UIWebView。这就是想法,也是目前的待办事项(图 14-14)。

That’s why I thought a refetch might be a good idea for inspecting URLs that we didn’t get a response for. We can make a separate deliberate request and get the response, so we don’t rely on the NSURLCache and UIWebView. That’s the idea and it’s a todo currently (Figure 14-14).

重新获取

图 14-14。重新获取

Figure 14-14. Refetch

另一件事是清除日志(图14-15)。这很容易,但清除缓存并不那么容易。我发誓我在某个时候做了它并且它正在工作(我必须销毁 UIWebView 才能使其工作),但后来我改变了其他东西,它就停止工作了。我怀疑的变化是当我删除了最初用于 UIWebView 的.xib/.nib文件时。

The other thing is clearing the log (Figure 14-15). That’s easy, but clearing the cache didn’t prove to be so easy. I swear I did it at some point and it was working (I had to destroy the UIWebView to make it work), but then I changed something else and it stopped working. The change I suspect is when I deleted the .xib/.nib file I originally had for the UIWebView.

清除日志和浏览器缓存

图 14-15。清除日志和浏览器缓存

Figure 14-15. Clearing log and browser cache

前方的路

The Road Ahead

前面的路是围绕 HAR 的。

The road ahead is around HAR.

正如你所看到的,我们可以查看请求/响应,但如果能有 yslow 分数、页面速度分数、缩小的潜在优势等一堆工具,那就太好了。我的想法是将性能智能工具与收集原始数据的机制分开。胶水是HAR。

As you can see we can look at requests/responses, but it would be nice also to have things like a yslow score, page speed score, potential wins of minification, etc.—a bunch of tools. My idea is to separate the tools of performance intelligence from the mechanics of collecting the raw data. And the glue is HAR.

我们有在线 HAR 查看器 ( http://www.softwareishard.com/har/viewer/ ),因此无需构建瀑布图,只需向其传递 HAR 文件即可。

We have the online HAR viewer (http://www.softwareishard.com/har/viewer/) so no need to build waterfall diagrams, just pass it a HAR file.

我们现在有了 YSlow 命令行,因此启动 Web UI 只是时间问题。它应该接受 HAR 并在其上运行所有 YSlow 智能。PageSpeed 也是如此。我不必将所有工具集成到 icy 中,而是让 icy 打开 Safari,指向工具的 URL,并向其传递 HAR。不用说,工具 URL 应该是可配置的,这样您就可以运行自己的工具,甚至是内部工具。

We now have a YSlow command line, so it is only a question of time before a Web UI gets going. It should accept a HAR and run all the YSlow intelligence on it. Same for PageSpeed. I shouldn’t have to integrate all tools in icy but rather have icy open Safari, point to a URL of a tool, and pass it a HAR. Needless to say tool URLs should be configurable so you can run your own, even in-house, tools.

icy 可以帮助解决的是对 UIWebView 的可见性:只是获取尽可能最好的数据,创建 HAR 并将其传递。这就是我所说的收集原始数据的机制,即“它就是这样”的数据。这与 YSlow 等工具的智能不同,后者可以回答以下问题:“我这里有这个页面,那么接下来怎么办?”

What icy can help address is the visibility into the UIWebView. Just getting the best data possible, creating a HAR and passing it on. This is what I call the mechanics of gathering the raw data, the “it is what it is” data. As opposed to the intelligence of tools like YSlow that can answer the question: “I have this page here, so what next?”
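
As an illustration of how little glue is needed, the sketch below wraps a list of logged request/response pairs in a HAR 1.2 envelope. The shape of the input records, the creator version string, and the assumption that every exchange was HTTP/1.1 are all made up for this example; it is not icy’s actual log format. Sizes that were not recorded are set to -1, which HAR uses for unavailable header and body sizes.

```javascript
// Sketch: wrap logged request/response pairs in a HAR 1.2 envelope.
// The record shape ({url, method, startedAt, durationMs, status, mimeType,
// bodySize, requestHeaders, responseHeaders}) is hypothetical.
function toHeaderList(headers) {
  return Object.keys(headers || {}).map(function (name) {
    return { name: name, value: String(headers[name]) };
  });
}

function buildHar(records, creatorName) {
  return {
    log: {
      version: "1.2",
      creator: { name: creatorName, version: "0.1" }, // version string is a placeholder
      entries: records.map(function (r) {
        return {
          startedDateTime: new Date(r.startedAt).toISOString(),
          time: r.durationMs,
          request: {
            method: r.method, url: r.url, httpVersion: "HTTP/1.1", // protocol assumed
            cookies: [], headers: toHeaderList(r.requestHeaders),
            queryString: [], headersSize: -1, bodySize: -1
          },
          response: {
            status: r.status, statusText: "", httpVersion: "HTTP/1.1",
            cookies: [], headers: toHeaderList(r.responseHeaders),
            content: { size: r.bodySize || 0, mimeType: r.mimeType || "" },
            redirectURL: "", headersSize: -1, bodySize: -1
          },
          cache: {},
          // Only a total duration is known, so all of it is attributed to "wait".
          timings: { send: 0, wait: r.durationMs, receive: 0 }
        };
      })
    }
  };
}
```

Serializing the result with JSON.stringify() gives a document that HAR-consuming tools can load, which is exactly the separation between data collection and analysis described above.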

我希望我们 Web 性能社区能够在每一个可能发出网络请求的设备上拥有这些小型轻量级“代理”,这样我们就可以收集原始 HTTP 数据并将其传递给优秀的旧工具以征求他们的意见。我们还需要知道运营商可能会采取哪些优化措施。所以…

And I’m hoping we, the web performance community, will have these little lightweight “agents” on every possible device that makes network requests, so we can gather the raw HTTP data and pass it to the good old tools for their opinions. We also need to know what possible optimizations carriers do. So…

圣诞节我想要的一切……

All I Want for Christmas…

……是更多的工具。我们只能改进我们所知道的。因此,了解正在发生的事情至关重要。

…is more tools. We can only improve what we know about. Therefore visibility into what’s going on is critical.

这个小icy应用程序只是一个例子,有点像对制造商、手机制造商、浏览器供应商说的——这就是我们想要的,现在给我!

This little icy app is just an example, sort of saying to manufacturers, phone builders, browser vendors—here’s what we want, now gimme!

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/i-see-http/。最初发布于 2011 年 12 月 14 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/i-see-http/. Originally published on Dec 14, 2011.

第 15 章使用智能缓存避免机器人性能税

Chapter 15. Using Intelligent Caching to Avoid the Bot Performance Tax

马修 ·普林斯

Matthew Prince

2004 年,Lee Holloway ( https://twitter.com/icqheretic ) 和我启动了 Project Honey Pot ( http://www.projecthoneypot.org/ )。该网站跟踪在线欺诈和滥用行为,主要由报告 IP 地址声誉的网页组成。虽然我们的资源有限并试图充分利用它们,但我刚刚检查了 Google,它在其索引中列出了超过 3100 万个页面,这些页面构成了 www.projecthoneypot.org ( http://www.projecthoneypot.org/ )网站。

In 2004, Lee Holloway (https://twitter.com/icqheretic) and I started Project Honey Pot (http://www.projecthoneypot.org/). The site, which tracks online fraud and abuse, primarily consists of web pages that report the reputation of IP addresses. While we had limited resources and tried to get the most out of them, I just checked Google, which lists more than 31 million pages in its index that make up the www.projecthoneypot.org (http://www.projecthoneypot.org/) site.

Project Honey Pot 的页面相对简单且资产较少,但与当今的许多网站一样,它们包含重要的动态内容,这些内容会以不可预测的时间间隔定期更新。为了提供近乎实时的更新,页面需要由数据库驱动。

Project Honey Pot’s pages are relatively simple and asset-light, but like many sites today they include significant dynamic content that is regularly updated at unpredictable intervals. To deliver near realtime updates, the pages need to be database driven.

为了最大限度地提高网站的性能,从一开始我们就使用了多个不同的缓存层来存储最常访问的页面。Lee 的背景是高性能数据库设计,他研究了 Google Analytics 等服务的报告,以了解访问者如何浏览网站并构建缓存以防止经常访问的页面需要访问数据库。

To maximize performance of the site, from the beginning we used a number of different caching layers to store the most frequently accessed pages. Lee, whose background is high-performance database design, studied reports from services like Google Analytics to understand how visitors moved through the site and built caching to keep regularly accessed pages from needing to hit the database.

我们认为自己很聪明,但是尽管遵循了 Web 应用程序性能设计的最佳实践,网站还是会以惊人的频率陷入瘫痪。事实证明,罪魁祸首是许多优化网络性能的人意想不到的、隐藏的东西:自动化机器人。

We thought we were pretty smart but, in spite of following the best practices of web application performance design, with alarming frequency the site would grind to a halt. The culprit turned out to be something unexpected and hidden from the view of many people optimizing web performance: automated bots.

一般网站超过 20% 的请求来自某种自动化机器人。这些机器人包括搜索引擎爬虫等常见的机器人,但也包括扫描漏洞或收集数据的恶意机器人。我们一直在 CloudFlare 网络上数十万个站点上跟踪这些数据,发现平均而言,大约 15% 的 Web 总请求源自一种或另一种形式的 Web 威胁 (http://blog.cloudflare.com/do-hackers-take-the-holidays-off),并根据日期上下波动(图15-1)。

The average website sees more than 20% of its requests coming from some sort of automated bot. These bots include the usual suspects like search engine crawlers, but also include malicious bots scanning for vulnerabilities or harvesting data. We’ve been tracking this data at CloudFlare across hundreds of thousands of sites on our network and have found that on average, approximately 15% of total web requests originate from a web threat of one form or another (http://blog.cloudflare.com/do-hackers-take-the-holidays-off), with swings up and down depending on the day (Figure 15-1).

假期的袭击

图 15-1。假期的袭击

Figure 15-1. Attack of the holidays

在 Project Honey Pot 的案例中,这些机器人的流量对性能产生了重大影响。因为它们不遵循典型的人类访问模式,所以它们经常触发我们缓存中不热门的页面。此外,由于机器人通常不会像 Google Analytics 等系统中使用的那样发射 Javascript 信标,因此它们的流量及其影响并不是立即显而易见的。

In Project Honey Pot’s case, the traffic from these bots had a significant performance impact. Because they did not follow the typical human visitation pattern, they were often triggering pages that weren’t hot in our cache. Moreover, since the bots typically didn’t fire Javascript beacons like those used in systems like Google Analytics, their traffic and its impact weren’t immediately obvious.

为了解决这个问题,我们实现了两个不同的系统来处理两种不同类型的机器人。由于我们拥有有关网络威胁的大量数据,因此我们能够利用这些数据来限制已知的恶意爬虫请求网站上的动态页面。仅仅取消威胁流量就会产生立竿见影的效果,并为合法访问者释放数据库资源。

To solve the problem, we implemented two different systems to deal with two different types of bots. Because we had great data on web threats, we were able to leverage that to restrict known malicious crawlers from requesting dynamic pages on the site. Just taking off the threat traffic had an immediate impact and freed up database resources for legitimate visitors.

同样的方法对于其他类型的自动化机器人:搜索引擎爬虫来说没有意义。我们希望通过在线搜索找到 Project Honey Pot 的页面,因此我们不想完全阻止搜索引擎爬虫。然而,尽管消除了威胁流量,但 Google、Yahoo 和 Microsoft 的爬虫同时访问该网站有时会导致 Web 服务器和数据库速度减慢。

The same approach didn’t make sense for the other type of automated bots: search engine crawlers. We wanted Project Honey Pot’s pages to be found through online searches, so we didn’t want to block search engine crawlers entirely. However, in spite of removing the threat traffic, Google, Yahoo, and Microsoft’s crawlers all accessing the site at the same time would sometimes cause the web server and database to slow to a crawl.

解决方案是修改我们的缓存策略。虽然我们希望向人类访问者提供最新结果,但我们开始从具有较长生存时间 (TTL) 的缓存中为搜索爬虫提供页面。我们试验了合适的页面 TTL,最终确定 1 天是 Project Honey Pot 网站的最佳选择。如果 Google 今天抓取了某个页面,然后百度在接下来的 24 小时内请求同一页面,我们将返回缓存版本,而不从数据库重新生成该页面。

The solution was a modification of our caching strategy. While we wanted to deliver the latest results to human visitors, we began serving search crawlers from a cache with a longer time to live (TTL). We experimented with the right TTLs for pages, but eventually settled on 1 day as being optimal for the Project Honey Pot site. If a page is crawled by Google today and then Baidu requests the same page within the next 24 hours, we return the cached version without regenerating the page from the database.
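
The idea can be sketched as a cache lookup whose TTL depends on the class of visitor. This is an illustration, not the code Project Honey Pot runs: the crawler pattern, the 60-second TTL for human visitors, and the function names are assumptions, and only the one-day crawler TTL comes from the text above.

```javascript
// Sketch: choose a cache TTL by visitor class.
// The crawler pattern and the 60-second TTL for people are assumptions;
// only the one-day crawler TTL comes from the text.
var CRAWLER_PATTERN = /googlebot|bingbot|msnbot|slurp|baiduspider/i;
var ONE_DAY_SECONDS = 24 * 60 * 60;
var HUMAN_TTL_SECONDS = 60;

function ttlForUserAgent(userAgent) {
  return CRAWLER_PATTERN.test(userAgent || "") ? ONE_DAY_SECONDS : HUMAN_TTL_SECONDS;
}

// Serve the cached page if it is younger than the TTL for this visitor;
// otherwise regenerate it (for example, from the database) and cache it.
function servePage(cache, url, userAgent, nowSeconds, regenerate) {
  var hit = cache[url];
  if (hit && nowSeconds - hit.storedAt < ttlForUserAgent(userAgent)) {
    return hit.body;
  }
  var body = regenerate(url);
  cache[url] = { body: body, storedAt: nowSeconds };
  return body;
}
```

User-Agent strings are easy to forge, so a production system would likely want a stronger signal than a pattern match before treating a request as coming from a search crawler.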

搜索引擎本质上看的是互联网的快照。虽然重要的是不要向爬虫提供欺骗性的不同内容,但修改缓存策略以最大程度地减少它们对 Web 应用程序的性能影响完全符合良好 Web 实践的范围。

Search engines, by their nature, see a snapshot of the Internet. While it is important to not serve deceptively different content to their crawlers, modifying your caching strategy to minimize their performance impact on your web application is well within the bounds of good web practices.

自从启动 CloudFlare ( https://www.cloudflare.com/ ) 以来,我们采用了在 Project Honey Pot 中开发的缓存策略,并使其更加智能和动态,以优化性能。我们会根据网站的特征自动调整搜索爬虫 TTL,并且非常擅长防止恶意爬虫攻击您的 Web 应用程序。平均而言,我们能够从 Web 应用程序卸载 70% 的请求 - 考虑到整个 CloudFlare 配置过程大约需要 5 分钟,这令人惊叹。虽然这种性能优势的一部分来自于传统的类似 CDN 的缓存,但一些最大的缓存优势实际上来自于处理机器人的深度页面视图,而传统的缓存策略无法缓解这种情况。

Since starting CloudFlare (https://www.cloudflare.com/), we’ve taken the caching strategy we developed at Project Honey Pot and made it more intelligent and dynamic to optimize performance. We automatically tune the search crawler TTL to the characteristics of the site, and are very good at keeping malicious crawlers from ever hitting your web application. On average, we’re able to offload 70% of the requests from a web application — which is stunning given the entire CloudFlare configuration process takes about 5 minutes. While some of this performance benefit comes from traditional CDN-like caching, some of the biggest cache wins actually come from handling bots’ deep page views that aren’t alleviated by traditional caching strategies.

结果可能是戏剧性的。例如,SXSW 的网站采用了广泛的传统 Web 应用程序和数据库缓存系统,但能够将其 Web 服务器和数据库计算机上的负载减少 50% 以上 ( http://blog.cloudflare.com/cloudflare-powers-the-sxsw-panel-picker ),这很大程度上是因为 CloudFlare 的机器人感知缓存(图 15-2)。

The results can be dramatic. For example, SXSW’s website employs extensive traditional web application and database caching systems but was able to reduce the load on their web servers and database machines by more than 50% (http://blog.cloudflare.com/cloudflare-powers-the-sxsw-panel-picker) in large part because of CloudFlare’s bot-aware caching (Figure 15-2).

机器人感知的缓存结果

图 15-2。机器人感知的缓存结果

Figure 15-2. Bot-aware caching results

当您调整 Web 应用程序以获得最佳性能时,如果您只查看基于信标的分析工具(例如 Google Analytics),您可能会错过 Web 应用程序负载的最大来源之一。这就是 CloudFlare 的分析报告所有访问者对您网站的访问情况的原因。即使没有 CloudFlare,挖掘原始服务器日志、了解机器人并构建区分不同类别访问者行为的缓存策略也可能是任何网站 Web 性能策略的一个重要方面。

When you’re tuning your web application for maximum performance, if you’re only looking at a beacon-based analytics tool like Google Analytics you may be missing one of the biggest sources of web application load. This is why CloudFlare’s analytics reports the visits from all visitors to your site. Even without CloudFlare, digging through your raw server logs, being bot-aware, and building caching strategies that differentiate between the behaviors of different classes of visitors can be an important aspect of any site’s web performance strategy.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/using-intelligent-caching-to-avoid-the-bot-performance-tax/。最初发布于 2011 年 12 月 15 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/using-intelligent-caching-to-avoid-the-bot-performance-tax/. Originally published on Dec 15, 2011.

第 16 章导航计时 API 实用指南

Chapter 16. A Practical Guide to the Navigation Timing API

巴迪· 布鲁尔

Buddy Brewer

导航计时 ( http://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html ) 是来自 W3C 的 Web 性能工作组 ( http://www.w3.org/2010/webperf/ ) 的 API,公开有关网页性能的数据。导航计时是一项重大的新开发,因为它使您能够从真实用户收集细粒度的性能指标,包括在基于 Javascript 的跟踪器有机会加载之前发生的事件。这使我们能够从真实用户的浏览器内部直接测量 DNS 解析、连接延迟以及首字节时间等内容。

Navigation Timing (http://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html) is an API from the W3C’s Web Performance Working Group (http://www.w3.org/2010/webperf/) that exposes data about the performance of your web pages. Navigation Timing is a major new development because it enables you to collect fine-grained performance metrics from real users, including events that happen before Javascript-based trackers have a chance to load. This gives us the ability to directly measure things like DNS resolution, connection latency, and time to first byte from inside the browsers of real users.

为什么你应该关心

Why You Should Care

我职业生涯的前八年都在构建综合监控产品,但我现在相信,在了解网站性能时,真实的用户监控应该是您首选的“真相”来源。这并不意味着您应该放弃综合监控,但今天我将其视为对真实用户监控的有用补充,而不是其本身的完整性能解决方案。

I spent the first eight years of my career building synthetic monitoring products but I now believe real user monitoring should be your preferred source of “The Truth” when it comes to understanding the performance of your site. That doesn’t mean you should throw away your synthetic monitoring, but today I view it as a useful complement to real user monitoring rather than a complete performance solution in itself.

真实的用户监控至关重要,因为它可以最准确地描述用户所使用的浏览器、位置和网络的真实体验。这是实际衡量缓存决策如何影响用户体验的唯一方法。衡量真实的人(使用真实的个性和真实的信用卡)还使您有机会在同一环境中收集绩效和业务指标,以便您可以了解加载时间对转化率和跳出率等关键业务指标的影响。

Real user monitoring is critical because it provides the most accurate portrayal of the true experience across the browsers, locations, and networks your users are on. It is the only way to realistically measure how your caching decisions impact the user experience. Measuring real people (with real personalities and real credit cards) also gives you an opportunity to collect performance and business metrics in the same context, so you can see what impact load times are having on key business metrics like conversion and bounce rates.

我们在导航计时方面面临的最大问题是没有一个好的系统来收集和分析原始数据。在本章中,我将描述这个问题的解决方案,可以使用免费工具快速部署。

The biggest problem we face with Navigation Timing is that there isn’t a good system for collecting and analyzing the raw data. In this chapter, I’ll describe a solution to this problem that can be quickly deployed using free tools.

收集导航计时时间戳并将其转化为有用的测量结果

Collecting Navigation Timing Timestamps and Turning Them into Useful Measurements

window.performance.timing 对象以相对于纪元的时间戳形式给出其所有指标。为了将这些转化为有用的测量结果,我们需要确定一个通用词汇并进行一些算术运算。我建议从以下几点开始:

The window.performance.timing object gives all of its metrics in the form of timestamps relative to the epoch. In order to turn these into useful measurements, we need to settle on a common vocabulary and do some arithmetic. I suggest starting with the following:

function getPerfStats() {
  var timing = window.performance.timing;
  return {
    dns: timing.domainLookupEnd - timing.domainLookupStart,
    connect: timing.connectEnd - timing.connectStart,
    ttfb: timing.responseStart - timing.connectEnd,
    basePage: timing.responseEnd - timing.responseStart,
    frontEnd: timing.loadEventStart - timing.responseEnd
  };
}

这为您提供了一个类似于您在综合监控工具中常见的瀑布组件的起点。收集这些数据一段时间并将其与您的合成数据进行比较以了解它们的接近程度会很有趣。

This gives you a starting point that is similar to the waterfall components you commonly see in synthetic monitoring tools. It would be interesting to collect this data for a while and compare it to your synthetic data to see how close they are.

使用 Google Analytics 作为性能数据仓库

Using Google Analytics as a Performance Data Warehouse

接下来我们需要一个地方来存储我们收集的数据。您可以编写自己的信标服务,或者简单地对查询字符串上的值进行编码,将它们记录在 Web 服务器的访问日志中,然后编写一个程序来解析和分析结果。然而这些都是耗时的方法。我们正在寻找能够以最低成本快速启动和运行的东西。输入谷歌分析(http://www.google.com/analytics/)。

Next we need a place to store the data we’re collecting. You could write your own beacon service or simply encode the values on a query string, log them in your web server’s access logs, and write a program to parse and analyze the results. However these are time-consuming approaches. We’re looking for something we can get up and running quickly and at minimal cost. Enter Google Analytics (http://www.google.com/analytics/).

Google Analytics 是互联网上最流行的免费网站分析系统。虽然 GA 在其网站速度分析报告 ( http://analytics.blogspot.com/2011/05/measure-page-load-time-with-site-speed.html ) 中自动提供基本性能指标,但它基于数据样本,仅报告总页面加载时间。我们可以通过使用 GA 的事件跟踪功能来存储和分析我们的细粒度导航计时指标来改进这一点:

Google Analytics is the most popular free web site analytics system on the Internet. While GA automatically provides basic performance metrics in its Site Speed Analytics Report (http://analytics.blogspot.com/2011/05/measure-page-load-time-with-site-speed.html), it is based on a sample of data and only reports on the total page load time. We can improve this by using GA’s event tracking capability to store and analyze our fine-grained Navigation Timing metrics:

window.onload = function() {
  if (window.performance && window.performance.timing) {
    var ntStats = getPerfStats();
    _gaq.push(["_trackEvent", "Navigation Timing", "DNS", undefined, ntStats.dns, true]);
    _gaq.push(["_trackEvent", "Navigation Timing", "Connect", undefined, ntStats.connect, true]);
    _gaq.push(["_trackEvent", "Navigation Timing", "TTFB", undefined, ntStats.ttfb, true]);
    _gaq.push(["_trackEvent", "Navigation Timing", "BasePage", undefined, ntStats.basePage, true]);
    _gaq.push(["_trackEvent", "Navigation Timing", "FrontEnd", undefined, ntStats.frontEnd, true]);
  }
};

前面的代码触发五个事件来传输我们的五个性能测量结果。我们正在等待加载事件,以确保我们获得前端时间的有效测量。如果我们不关心前端性能,我们可以在页面加载期间的任何时候触发事件。每次调用中的最终true参数非常重要,可以确保事件不会被 Google Analytics 误解为用户交互,从而影响跳出率计算。

The preceding code fires five events to transmit our five performance measurements. We are waiting until the load event to ensure we get a valid measurement of the front end time. If we weren’t concerned with front end performance, we could fire the events at any point during page load. The final true parameter in each call is important to ensure that the events don’t get misinterpreted by Google Analytics as user interactions, which would skew bounce rate calculations.
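Since the five calls differ only in the action label and the value, they could also be generated from a small lookup table. The following is a minimal sketch, assuming the stats object uses the property names shown in the code above; `NT_FIELDS` and `buildTimingEvents` are hypothetical helpers introduced for illustration. Values are rounded on the assumption that the event value should be an integer, and missing or negative values are skipped rather than reported.

```javascript
// Event action label -> property name on the stats object (names from the code above).
var NT_FIELDS = {
  DNS: "dns",
  Connect: "connect",
  TTFB: "ttfb",
  BasePage: "basePage",
  FrontEnd: "frontEnd"
};

// Build the list of _trackEvent commands without touching _gaq,
// so the result can be inspected before it is sent.
function buildTimingEvents(stats) {
  var commands = [];
  for (var action in NT_FIELDS) {
    var value = stats[NT_FIELDS[action]];
    // Skip anything that is not a usable, non-negative number.
    if (typeof value !== "number" || !isFinite(value) || value < 0) continue;
    // The final `true` marks the event as non-interaction.
    commands.push(["_trackEvent", "Navigation Timing", action, undefined, Math.round(value), true]);
  }
  return commands;
}

// Usage in the page, inside the load handler:
//   buildTimingEvents(getPerfStats()).forEach(function (cmd) { _gaq.push(cmd); });
```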

有关详细信息,请参阅 Google Analytics 事件跟踪指南 ( http://code.google.com/apis/analytics/docs/tracking/eventTrackerGuide.html )。

For more information see the Google Analytics Event Tracking Guide (http://code.google.com/apis/analytics/docs/tracking/eventTrackerGuide.html).

Google Analytics 中的性能报告

Reporting on Performance in Google Analytics

现在我们已经在 Google Analytics 中收集了导航计时数据,是时候运行一些报告了。登录 Google Analytics,然后单击“内容”→“事件”→“热门事件”。单击“事件类别”列表下的“导航计时”,GA 将显示一个表格,其中显示五个性能维度中每个维度的测量数量和平均值。此视图还允许您绘制随时间变化的五个维度中任意一个的平均值(图 16-1)。

Now that we’ve collected our Navigation Timing data in Google Analytics, it’s time to run some reports. Log into Google Analytics and click Content → Events → Top Events. Click on Navigation Timing under the Event Category list and GA displays a table showing the number of measurements and average value for each of our five performance dimensions. This view also lets you plot the average value of any of the five dimensions over time (Figure 16-1).

图 16-1。Google Analytics 报告示例

Figure 16-1. Example Google Analytics Report

局限性

Limitations

这种方法的优点是可以使用免费提供的工具和技术快速设置。但与大多数快速且便宜的东西一样,它也有一些缺点:

This approach has the advantage of being quick to set up using freely available tools and techniques. But as with most things that are fast and cheap, it has a few shortcomings:

缺乏浏览器覆盖
Lack of browser coverage

导航计时在 Safari(桌面版或移动版)中尚不可用,并且显然在未来一段时间内的旧版本浏览器中也将不可用。使用一部分浏览器进行测试可能适合在页面开始解析之前测量条件,但是当您开始查看前端性能时,某些浏览器缺乏数据会产生更大的影响。

Navigation Timing isn’t yet available in Safari (desktop or mobile) and obviously won’t be available in legacy versions of browsers that will be around for some time to come. Testing with a subset of browsers is probably fine for measuring conditions before the page starts getting parsed, but when you begin looking at frontend performance the lack of data from certain browsers has a bigger impact.

没有对象级别数据
No object level data

综合监控仍然占据主导地位。W3C 资源计时 ( http://dvcs.w3.org/hg/webperf/raw-file/tip/specs/ResourceTiming/Overview.html ) 规范承诺将来提供来自真实用户的对象级数据,但截至这篇文章在任何流行的浏览器中都不可用。

Synthetic monitoring still rules the roost here. The W3C Resource Timing (http://dvcs.w3.org/hg/webperf/raw-file/tip/specs/ResourceTiming/Overview.html) specification promises to provide object level data from real users in the future, but as of this writing it isn’t available in any popular browsers.

受限于 Google Analytics 报告系统的功能
Limited to the capabilities of the Google Analytics reporting system

使用 Google Analytics,您必须接受所给的东西。您可以生成并绘制测量平均值,但您不会获得百分位数、降级警报或您习惯从性能监控工具看到的许多其他功能。

With Google Analytics, you have to take what you’re given. You can generate and plot averages of measurements, but you won’t get percentiles, degradation alerts, or many other features you are accustomed to seeing from performance monitoring tools.
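To see why averages alone can mislead, here is a small sketch of a nearest-rank percentile calculation. It assumes you have the raw samples available, which the Google Analytics report does not give you, so the data would have to come from your own logging. The `percentile` helper and the sample values are made up for illustration.

```javascript
// Nearest-rank percentile: sort the samples and pick the value at rank ceil(p/100 * n).
function percentile(samples, p) {
  if (samples.length === 0) return undefined;
  var sorted = samples.slice().sort(function (a, b) { return a - b; });
  var rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical TTFB samples in milliseconds, with one slow outlier.
var ttfbSamples = [120, 80, 2500, 95, 110];
var mean = ttfbSamples.reduce(function (sum, v) { return sum + v; }, 0) / ttfbSamples.length;

// mean is 581, while percentile(ttfbSamples, 50) is 110:
// a single slow request drags the average far away from the typical experience.
```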

最后的想法

Final Thoughts

既然导航计时已在排名前三的浏览器中可用,每个人都应该在其性能工具箱中拥有某种形式的真实用户监控。上面概述的方法并不完美,但它可以免费为您提供基本的覆盖范围,并且只需付出最少的努力。

Now that Navigation Timing is available in the top three browsers, everyone should have some form of real user monitoring in their performance toolbox. The approach outlined above isn’t perfect but it gives you a basic level of coverage at no cost and minimal effort.

我的公司 Log Normal ( http://www.lognormal.com/ ) 正在构建优质的真实用户监控解决方案,旨在让您尽可能深入地了解真实用户的性能。如果您有兴趣了解更多信息,请访问我们的网站,并请求测试版邀请 ( http://www.lognormal.com/ )。

My company, Log Normal (http://www.lognormal.com/), is building a premium real user monitoring solution that aims to give you the best possible insight into real user performance. If you’re interested in learning more, head over to our website, and request a beta invitation (http://www.lognormal.com/).

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/a-practical-guide-to-the-navigation-timing-api/。最初发布于 2011 年 12 月 16 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/a-practical-guide-to-the-navigation-timing-api/. Originally published on Dec 16, 2011.

第 17 章响应时间如何影响业务

Chapter 17. How Response Times Impact Business

亚历山大 ·波德尔科

Alexander Podelko

看起来人们对量化性能对业务的影响非常感兴趣,将响应时间与收入和客户满意度联系起来。发布了很多信息,例如 Aberdeen Group 报告“客户在一秒钟内赢得或失去”,或者 Gomez 白皮书“为什么 Web 性能很重要:您的网站是否正在赶走客户?”毫无疑问,响应时间和业务指标之间存在很强的相关性,拥有此类文档来证明性能工程工作的合理性非常好(从实践的角度来看,进行一些简化可能会很好),但我们应该记住,这种关系并不是那么简单和线性,在某些情况下这一点可能很重要。

There is great interest in quantifying the impact of performance on business, linking response time to income and customer satisfaction. A lot of information has been published, for example, the Aberdeen Group report “Customers Are Won or Lost in One Second” or the Gomez whitepaper “Why Web Performance Matters: Is Your Site Driving Customers Away?” There is no doubt that there is a strong correlation between response times and business metrics, and it is very good to have such documents to justify performance engineering efforts (some simplification may even be good from a practical point of view). But we should keep in mind that the relationship is not so simple and linear, and there may be cases where that distinction matters.

响应时间可以被视为可用性要求,并且基于人机交互的基本原理。早在 1968 年,罗伯特·米勒 (Robert Miller) 的论文“人机对话事务中的响应时间”就描述了人类注意力的三个阈值水平。Jakob Nielsen 认为 Miller 的指导方针是人机交互的基础 ( http://www.useit.com/papers/responsetime.html ),因此它们仍然有效,并且不太可能随着未来技术的发展而改变。这三个阈值是:

Response times may be considered as usability requirements and are based on the basic principles of human-computer interaction. As long ago as 1968, Robert Miller’s paper “Response Time in Man-Computer Conversational Transactions” described three threshold levels of human attention. Jakob Nielsen believes that Miller’s guidelines are fundamental for human-computer interaction (http://www.useit.com/papers/responsetime.html), so they are still valid and not likely to change with whatever technology comes next. These three thresholds are:

  • 用户将响应时间视为瞬时(0.1-0.2 秒)

  • Users view response time as instantaneous (0.1-0.2 seconds)

  • 用户感觉他们正在与信息自由交互(1-5 秒)

  • Users feel they are interacting freely with the information (1-5 seconds)

  • 用户将注意力集中在对话框上(5-10 秒)

  • Users are focused on the dialog box (5-10 seconds)

用户将响应时间视为瞬时(0.1-0.2 秒):用户感觉他们直接操作用户界面中的对象。例如,从用户选择表中的一列到该列突出显示的时间,或者键入符号和其出现在屏幕上之间的时间。罗伯特·米勒 (Robert Miller) 报告称该阈值为 0.1 秒。根据 Peter Bickford 的说法,0.2 秒形成了似乎同时发生的事件和彼此回响的事件之间的心理界限 ( http://web.archive.org/web/20040913083444/http://developer.netscape.com/viewsource/bickford_wait.htm )。

Users view response time as instantaneous (0.1-0.2 seconds): Users feel that they directly manipulate objects in the user interface. For example, the time from the moment the user selects a column in a table until that column highlights or the time between typing a symbol and its appearance on the screen. Robert Miller reported that threshold as 0.1 seconds. According to Peter Bickford 0.2 seconds forms the mental boundary between events that seem to happen together and those that appear as echoes of each other (http://web.archive.org/web/20040913083444/http://developer.netscape.com/viewsource/bickford_wait.htm).

尽管这是一个相当重要的门槛,但它往往超出了应用程序开发人员的能力范围。这种交互由操作系统、浏览器或接口库提供,通常发生在客户端,而不与服务器交互(哑终端除外,这对于当今的业务系统来说是一个例外)。然而,新的丰富的 Web 界面可能会使这个阈值变得很重要。例如,如果存在处理用户输入的逻辑,因此屏幕导航或符号输入变慢,即使响应时间相对较短,也可能会导致用户沮丧。

Although it is quite an important threshold, it is often beyond the reach of application developers. That kind of interaction is provided by the operating system, browser, or interface libraries, and usually happens on the client side, without interaction with servers (except for dumb terminals, which are rather an exception for business systems today). However, new rich web interfaces may make this threshold important to consider. For example, if logic that processes user input slows down screen navigation or typing, it may cause user frustration even with relatively small response times.

用户感觉他们正在与信息自由交互(1-5 秒):他们注意到延迟,但感觉计算机正在按照命令“工作”。用户的思维流动保持不间断。罗伯特·米勒报告这个阈值是一两秒。

Users feel they are interacting freely with the information (1-5 seconds): They notice the delay, but feel that the computer is “working” on the command. The user’s flow of thought stays uninterrupted. Robert Miller reported this threshold as one to two seconds.

Peter Sevcik 确定了影响此阈值的两个关键因素 ( http://www.netforecast.com/Articles/BCR%20C26%20How%20Fast%20is%20Fast%20Enough.pdf ):查看的元素数量和任务的重复性。例如,查看的元素的数量是用户查看的项目、字段或段落的数量。用户愿意等待的时间似乎是感知到的请求复杂性的函数。

Peter Sevcik identified two key factors impacting this threshold (http://www.netforecast.com/Articles/BCR%20C26%20How%20Fast%20is%20Fast%20Enough.pdf): the number of elements viewed and the repetitiveness of the task. The number of elements viewed is, for example, the number of items, fields, or paragraphs the user looks at. The amount of time the user is willing to wait appears to be a function of the perceived complexity of the request.

早在 20 世纪 60 年代到 80 年代,终端界面相当简单,典型的任务是数据输入,通常一次一个元素。因此早期的研究人员报告说,一到两秒是保持最大生产力的阈值。具有许多元素的现代复杂用户界面可能具有更长的响应时间,而不会对用户生产力产生不利影响。用户还会以一定的速度与应用程序进行交互,具体取决于每个任务的重复程度。有些是高度重复的;其他则要求用户在进入下一个屏幕之前思考并做出选择。任务重复性越高,响应时间就应该越好。

Back in the 1960s through the 1980s, the terminal interface was rather simple and a typical task was data entry, often one element at a time. So earlier researchers reported that one to two seconds was the threshold to keep maximal productivity. Modern complex user interfaces with many elements may have higher response times without adversely impacting user productivity. Users also interact with applications at a certain pace depending on how repetitive each task is. Some are highly repetitive; others require the user to think and make choices before proceeding to the next screen. The more repetitive the task is, the shorter the response time should be.

这是为我们提供大多数用户交互应用程序的响应时间可用性目标的阈值。响应时间超过此阈值会降低生产力。确切的数字取决于许多难以形式化的因素,例如查看的元素的数量和类型或任务的重复性,但对于大多数典型的业务应用程序来说,两到五秒的目标是合理的。

That is the threshold that gives us response time usability goals for most user-interactive applications. Response times above this threshold degrade productivity. Exact numbers depend on many difficult-to-formalize factors, such as the number and types of elements viewed or repetitiveness of the task, but a goal of two to five seconds is reasonable for most typical business applications.

有研究人员认为响应时间预期会随着时间的推移而增加。2009 年 Forrester 研究 ( http://www.akamai.com/html/about/press/releases/2009/press_091409.html) 建议两秒响应时间;2006 年,类似的研究表明需要 4 秒(两项研究工作均由 Web 加速解决方案提供商 Akamai 赞助)。虽然这种趋势可能存在(至少对于互联网和移动应用程序而言,最近人们的期望发生了很大变化),但这项研究的方法经常受到质疑,因为他们只是询问用户。众所周知,用户对时间的感知可能会产生误导。此外,如前所述,响应时间预期取决于查看的元素数量、任务的重复性、用户对系统正在执行的操作的假设以及与用户的界面交互。在没有具体说明我们正在讨论的页面的情况下陈述标准可能会过于笼统。

Some researchers suggest that response time expectations grow more demanding over time. Forrester research from 2009 (http://www.akamai.com/html/about/press/releases/2009/press_091409.html) suggests a two-second response time; in 2006 similar research suggested four seconds (both research efforts were sponsored by Akamai, a provider of web acceleration solutions). While the trend probably exists (at least for the Internet and mobile applications, where expectations have changed a lot recently), the approach of this research has often been questioned because the researchers simply asked users. It is known that user perception of time may be misleading. Also, as mentioned earlier, response time expectations depend on the number of elements viewed, the repetitiveness of the task, user assumptions about what the system is doing, and interface interactions with the user. Stating a standard without specifying which page we are talking about may be an overgeneralization.

用户将注意力集中在对话框上(5-10 秒):他们将注意力集中在任务上。Robert Miller 报告阈值为 10 秒。当用户在延迟超过此阈值后返回任务时,可能需要重新调整自己的方向,因此生产力会受到影响。或者,如果我们谈论的是网站,那么它就是用户开始放弃该网站的门槛。

Users are focused on the dialog box (5-10 seconds): They keep their attention on the task. Robert Miller reported this threshold as 10 seconds. Users will probably need to reorient themselves when they return to the task after a delay above this threshold, so productivity suffers. Or, if we are talking about websites, it is the threshold at which users start abandoning the site.

Peter Bickford 调查了用户的反应,在 27 次几乎瞬时的响应之后,同一操作出现第 28 次 2 分钟的等待循环 ( http://web.archive.org/web/20040913083444/http://developer.netscape.com/viewsource/bickford_wait.htm )。一半的受试者只用了 8.5 秒就退出或重启。在等待期间切换到手表光标使对象的离开延迟了大约 20 秒。动画的手表光标可以持续一分多钟,进度条让用户一直等到最后。Bickford 的结果被广泛用于设置 Web 应用程序的响应时间要求。

Peter Bickford investigated user reactions when, after 27 almost instantaneous responses, the 28th run of the same operation went into a two-minute wait loop (http://web.archive.org/web/20040913083444/http://developer.netscape.com/viewsource/bickford_wait.htm). It took only 8.5 seconds for half the subjects to either walk out or hit the reboot. Switching to a watch cursor during the wait delayed the subjects’ departure for about 20 seconds. An animated watch cursor was good for more than a minute, and a progress bar kept users waiting until the end. Bickford’s results were widely used for setting response time requirements for web applications.

这是为我们提供大多数用户交互应用程序的响应时间可用性要求的阈值。超过此阈值的响应时间会导致用户失去注意力并导致沮丧。确切的数字因所使用的界面而异,但看起来在大多数情况下响应时间不应超过 8 到 10 秒。不过,阈值不应该盲目应用;在许多情况下,当实施适当的用户界面来缓解问题时,明显更高的响应时间是可以接受的。

That is the threshold that gives us response time usability requirements for most user-interactive applications. Response times above this threshold cause users to lose focus and lead to frustration. Exact numbers vary significantly depending on the interface used, but it looks like response times should not be more than 8 to 10 seconds in most cases. Still, the threshold shouldn’t be applied blindly; in many cases, significantly higher response times may be acceptable when appropriate user interface is implemented to alleviate the problem.
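For reporting purposes it can be handy to bucket measured response times by the three thresholds. The sketch below does that, with one loud assumption: it uses the upper bound of each quoted range (0.2, 5, and 10 seconds) as a hard cutoff, which is a simplification made for this example and exactly the kind of blind application cautioned against above. `THRESHOLDS` and `classifyResponseTime` are hypothetical names.

```javascript
// Upper bounds in seconds, taken from the ranges quoted in this chapter.
// Treating them as hard cutoffs is an assumption of this sketch only.
var THRESHOLDS = [
  { max: 0.2, label: "instantaneous" },
  { max: 5, label: "interacting freely" },
  { max: 10, label: "focused on the dialog" }
];

// Return the label of the first bucket the response time fits into.
function classifyResponseTime(seconds) {
  for (var i = 0; i < THRESHOLDS.length; i++) {
    if (seconds <= THRESHOLDS[i].max) return THRESHOLDS[i].label;
  }
  return "attention likely lost";
}
```

In practice the cutoffs would need to be adjusted per page, depending on the number of elements viewed, the repetitiveness of the task, and the feedback the interface gives while waiting.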

因此,虽然响应时间和业务指标之间存在很强的相关性,但它绝对不是线性函数。我们接触的是人机交互的心理学,它绝对不是一个单一维度的问题。它是非常具体的上下文,应谨慎使用已发布的数据,并了解其背后的真正含义。主要的实际结论是,您可能会遇到进一步的性能改进没有多大意义的情况:您的性能改进成本不断增加,而业务价值却不断减少。尽管看起来大多数现有系统还没有达到这一点。

So while there is a strong correlation between response times and business metrics, it is definitely not a linear function. We are touching on the psychology of human-computer interaction, and it is definitely not a single-dimension issue. It is very context-specific, and published data should be used carefully, with an understanding of what really stands behind it. The main practical conclusion is that you may reach a point where further performance improvement won’t make much sense: the cost of improving performance keeps increasing while the business value diminishes. That said, it looks like most existing systems haven’t reached this point yet.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/how-response-times-impact-business/。最初发布于 2011 年 12 月 17 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/how-response-times-impact-business/. Originally published on Dec 17, 2011.

第 18 章移动 UI 性能注意事项

Chapter 18. Mobile UI Performance Considerations

埃斯特尔 ·韦尔

Estelle Weyl

移动部分是互联网用户增长最快的部分。如果您的网站可以通过移动浏览器访问,您会注意到您的移动操作系统统计数据一直在快速增加。考虑到移动设备的开发将改善所有设备上的用户体验,而不仅仅是手机。无论您是否首先针对移动设备进行设计 ( http://www.lukew.com/ff/entry.asp?933 ),您在开发 Web 应用程序时肯定需要考虑移动性能。

The mobile segment is the fastest growing segment of Internet users. If your site is accessible via a mobile browser, you’ll notice that your mobile OS stats have been increasing rapidly. Developing with mobile in mind will improve user experience on all devices, not just phones. Whether or not you design for mobile first (http://www.lukew.com/ff/entry.asp?933), you definitely need to consider mobile performance when developing web applications.

移动设备的浏览器可能与个人计算机上的浏览器类似,甚至功能更强大。即使使用更先进的浏览器,设备本身也可能具有与您 1999 年使用的 Pentium III 类似的内存和带宽限制。虽然您的用户可能使用类似的应用程序来访问您的网站,但设备本身会创建您需要的各种限制开发时考虑。

Mobile devices may have browsers that are similar to, or even more fully featured than, the browsers on personal computers. Even with more advanced browsers, the devices themselves may have memory and bandwidth constraints similar to those of the Pentium III you were using back in 1999. While your users may be using similar applications to access your sites, the devices themselves create various constraints that you need to consider during development.

对于移动设备,您需要在整个开发过程中考虑电池寿命、延迟、内存和 UI 响应能力。

When it comes to mobile, you need to take battery life, latency, memory, and UI responsiveness into consideration throughout the development process.

电池寿命

Battery Life

移动用户就是这样:移动。与始终固定在墙上的台式电脑甚至通常由固定用户使用的笔记本电脑不同,移动用户不会全天为设备充电。移动用户希望他们的设备充电后可持续使用至少 24 小时。

Mobile users are just that: mobile. Unlike desktop computers which are tethered to the wall at all times, and even laptop computers which are generally used by stationary users, mobile users do not recharge their devices throughout the day. Mobile users expect their devices to last at least 24 hours between recharging.

虽然大多数用户意识到通话和 GPS 使用会消耗电池电量,但他们没有意识到不同的网站会比其他网站更快地耗尽电池电量。您可能已经注意到,拔掉电源后,CPU 使用率会耗尽笔记本电脑的电池。CPU 使用率同样会耗尽移动设备的电池!管理 CPU 使用情况。避免重涂。最小化 JavaScript 的大小和活动。始终使用 CSS,而不是 JavaScript 来制作动画。而且,即使支持,也永远不要将 WebGL 提供给移动设备。

While most users realize that calls and GPS usage consume battery power, they don’t realize that different websites will drain their battery faster than other sites. You may have noticed that CPU usage drains the battery on your laptop when unplugged. CPU usage drains the battery on your mobile device just as effectively! Manage CPU usage. Avoid repaints. Minimize both size and activity of your JavaScript. Always use CSS, rather than JavaScript for animations. And, even when supported, never serve WebGL to a mobile device.

如果您未接通电源,任何使您的笔记本电脑转动、预热或打开计算机风扇的因素也会耗尽电池电量。请记住,您的移动设备用户未接通电源!

Anything that makes your laptop churn, warm up, or turn your computer’s fan on also drains the battery if you’re not plugged in. Remember, your mobile device users are not plugged in!

潜伏

Latency

下载和上传速度不等于 ISP 销售的带宽。所引用的 MBps 实际上是人们可能希望获得的最快连接。网站(包括标记、样式表、媒体、应用程序脚本和第三方脚本)在我们的设备上的速度受到延迟的影响几乎与 Edge 或 3G 营销术语带宽的影响一样大。

Download and upload speeds are NOT equal to the bandwidth marketed by ISPs. The quoted MBps is actually the fastest connection one could possibly ever hope to get. The speed at which a website, including the markup, stylesheets, media, application scripts, and third-party scripts, makes it onto our devices is impacted almost as much by latency as by the bandwidth behind marketing terms such as Edge or 3G.

我们不会在这里深入讨论延迟。如果您想更好地了解总体延迟和带宽,请查看Tom Hughes-Croucher 撰写的带宽工程师指南( http://developer.yahoo.com/blogs/ydn/posts/2009/10/a_engineers_gui/ ) ( http://twitter.com/sh1mmer)。(它还描述了一些减少数据包的技巧。)

We won’t dive into latency here. If you want a better understanding of latency and bandwidth in general, check out An Engineer’s Guide to Bandwidth (http://developer.yahoo.com/blogs/ydn/posts/2009/10/a_engineers_gui/) by Tom Hughes-Croucher (http://twitter.com/sh1mmer). (It also describes some tips on reducing packets.)

“移动用户的延迟很严重,因此针对移动设备进行优化的网站应该真正减少其发出的 HTTP 请求数量。请注意,通过 WiFi 上网的移动用户体验的延迟要低得多。” — 菲利普·特利斯 ( http://www.yuiblog.com/blog/2010/04/08/analyzing-bandwidth-and-latency/ )

“Mobile users have terrible latency, so a site optimized for mobile should really reduce the number of HTTP requests it makes. Note that mobile users that surf the Web over WiFi experience far lower latency.” (Philip Tellis, http://www.yuiblog.com/blog/2010/04/08/analyzing-bandwidth-and-latency/)

重要的是要知道,与系留设备或通过 WiFi 访问互联网的设备相比,延迟对移动设备下载速度的影响要大得多。实际速度更多地与数据包丢失和延迟有关。空气——从移动设备到手机信号塔所经过的数据包——是造成延迟的主要原因。换句话说,您使用 3/4G 的移动用户的带宽已经很低。延迟使他们的网上冲浪体验更加痛苦。

What is important to know is that latency has a much larger impact on download speeds on mobile devices than on tethered devices or devices accessing the Internet via WiFi. Actual speeds have more to do with packet loss and latency. Air (the stuff packets go through to get from a mobile device to a cell tower) is the main cause of latency. In other words, your mobile users on 3G/4G already have low bandwidth. Latency makes their web surfing experience that much more painful.

由于延迟问题,减少 DNS 查找和 HTTP 请求在移动领域至关重要。这将我们引向第一个 Web 性能优化反模式:嵌入样式表和脚本。

Because of latency issues, reducing DNS lookups and HTTP requests is vital in the mobile space. This leads us to the first web performance optimization anti-pattern: embedding stylesheets and scripts.
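To get a feel for why the number of requests matters so much on a high-latency link, consider a deliberately crude model in which every request pays one full round trip and all bytes share a single pipe. The function name and the sample numbers below are illustrative assumptions only; real browsers issue requests in parallel and real networks add DNS, TCP, and packet loss effects that this sketch ignores.

```javascript
// Crude estimate: one round trip per request plus raw transfer time.
// Assumes strictly sequential requests and a constant throughput.
function estimateLoadMs(requestCount, rttMs, totalBytes, bytesPerSecond) {
  var latencyMs = requestCount * rttMs;
  var transferMs = (totalBytes / bytesPerSecond) * 1000;
  return latencyMs + transferMs;
}

// Hypothetical numbers: 200,000 bytes over a 100,000 bytes/s link with a 300 ms round trip.
var twentyRequests = estimateLoadMs(20, 300, 200000, 100000); // 6000 ms latency + 2000 ms transfer
var singleRequest = estimateLoadMs(1, 300, 200000, 100000);   // 300 ms latency + 2000 ms transfer
```

With the same payload, the twenty-request version spends three times as long waiting on round trips as it does moving bytes, which is the motivation for collapsing requests.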

嵌入 CSS 和 JS:最佳实践?

Embedding CSS and JS: A Best Practice?

加快网站速度的最佳实践 ( http://developer.yahoo.com/performance/rules.html ) 建议将 JavaScript 和 CSS 文件设置为外部文件并使用内容交付网络 (CDN)。然而,外部文件意味着更多的 HTTP 请求,而对静态内容使用 CDN 会增加更多的 DNS 查找和更多的 HTTP 请求。虽然在 HTML 中嵌入 CSS 和 JS 违背了我所拥护的所有最佳实践,但如果做得正确,在首次加载时嵌入脚本可以帮助提高性能。Bing 的移动网站就是一个完美的例子(图 18-1、图 18-2)。

Best practices for speeding up your website (http://developer.yahoo.com/performance/rules.html) recommend making your JavaScript and CSS files external and using a content delivery network, or CDN. However, external files mean more HTTP requests, and using CDNs for static content adds both more DNS lookups and more HTTP requests. While embedding CSS and JS in your HTML goes against all best practices I’ve ever espoused, if done correctly, embedding your scripts on first load can help improve performance. Bing’s mobile website is a perfect example (Figure 18-1, Figure 18-2).

图 18-1。首次下载为 203.7 KB,后续下载为 15.3 KB

Figure 18-1. First download is 203.7 KB, following download is 15.3 KB

图 18-2。Bing 手机网站截图

Figure 18-2. Screenshot of Bing’s mobile website

正如 Nicholas Zakas ( http://www.slideshare.net/nzakas/mobile-web-speed-bumps ) 所指出的,当您第一次从移动设备访问 m.bing.com ( http://m.bing.com/ ) 时,整个网站将作为单个文件加载。CSS和JS是嵌入的。图像包含在数据 URI 中。移动版 Bing 将所有资源放入一个文件中,只需要一个 HTTP 请求。但是,该单个文件有 200KB。那是巨大的。但是,只有第一次访问 Bing 才会返回这么大的文件。通过利用 localStorage 和 cookie,对 m.bing.com 的每个后续请求都会返回一个大小可管理的文件。虽然第一个请求返回一个巨大的文件,但每个后续请求都会生成大约 15KB 的响应。

As pointed out by Nicholas Zakas (http://www.slideshare.net/nzakas/mobile-web-speed-bumps), when you access m.bing.com (http://m.bing.com/) for the first time from your mobile device, the entire site loads as a single file. The CSS and JS are embedded. Images are included as data URIs. Bing for mobile puts all its assets into a single file, necessitating only a single HTTP request. That single file is 200KB, which is huge, but only the first visit to Bing returns such a large file. By taking advantage of localStorage and cookies, every subsequent request to m.bing.com returns a single file of manageable size: a response of about 15KB.

Bing 将所需的所有文件嵌入到单个 HTML 文件中。Bing 使用客户端 JavaScript 从原始下载中提取 CSS、JS 和图像,并将 CSS、JS 和图像数据 URI 保存在本地存储中。Bing 将所存储文件的名称保存在 cookie 中。对于每个后续页面请求,cookie 都会通知服务器哪些文件已保存在本地,从而允许服务器确定哪些资产(如果有)需要包含在响应中。这样,后续响应仅包含未保存在本地存储中的脚本、样式和图像(如果有)以及 HTML。

Bing embeds all the files needed into the single HTML file. Using client-side JavaScript, Bing extracts the CSS, JS, and images from the original download, and saves the CSS, JS, and image data URIs in local storage. Bing saves the names of the stored files in a cookie. With every subsequent page request, the cookie informs the server which files are already saved locally, allowing the server to determine which assets, if any, need to be included in the response. In this way, subsequent responses only include scripts, styles, and images not saved in local storage, if any, along with the HTML.

通过使用针对所有 HTML、CSS、JS 和图像的单个 HTTP 请求创建 Web 应用程序来减少移动网站下载中延迟的负面影响,步骤包括以下步骤:

Reducing the negative effects of latency on a mobile site download, by building a web app that needs only a single HTTP request for all HTML, CSS, JS, and images, involves the following steps:

  • 嵌入 CSS 和 JS 以进行首页加载

  • Embedding CSS & JS for first page load

  • 将上述嵌入文件解压并放入LocalStorage

  • Extract and put the above embedded files in LocalStorage

  • 使用提取的嵌入文件的名称设置 cookie

  • Set cookies with the names of the extracted embedded files

  • 在后续请求中,检查服务器端的 cookies

  • On subsequent requests, check the cookies server side

  • 仅根据 cookie 值嵌入新的和缺失的脚本

  • Only embed new and missing scripts based on cookie values

  • 加载时从 localStorage 资源加载文件

  • Load files from localStorage resources on load
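The steps above can be sketched roughly as follows. This is not Bing’s actual code, only a minimal illustration under stated assumptions: the cookie name `cached`, the asset names, and both function names are hypothetical, and the storage object is passed in so that the same code works with `window.localStorage` in a browser.

```javascript
// Client side: save each embedded asset and return a cookie string naming what is now cached.
// `assets` maps a (hypothetical) asset name to its text: CSS, JS, or an image data URI.
function cacheEmbeddedAssets(assets, storage) {
  var names = [];
  for (var name in assets) {
    if (Object.prototype.hasOwnProperty.call(assets, name)) {
      storage.setItem(name, assets[name]);
      names.push(name);
    }
  }
  return "cached=" + encodeURIComponent(names.join(","));
}

// Server side: given the Cookie request header, list the required assets
// that are not yet cached on the device and therefore must be embedded.
function assetsToEmbed(cookieHeader, requiredNames) {
  var match = /(?:^|;\s*)cached=([^;]*)/.exec(cookieHeader || "");
  var cached = match ? decodeURIComponent(match[1]).split(",") : [];
  return requiredNames.filter(function (name) {
    return cached.indexOf(name) === -1;
  });
}

// Browser usage (first page load):
//   document.cookie = cacheEmbeddedAssets({ "site.css": cssText, "app.js": jsText }, window.localStorage);
```

A production version would also need to merge new names with those already in the cookie, version the assets so stale copies get replaced, and handle storage quota errors.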

注意:如果您想知道为什么此方法可能比简单地下载和缓存文件更有效:此方法不仅通过避免多个 DNS 查找和 HTTP 请求的延迟来提高性能,而且移动设备的缓存更加有限,对于 iOS没有持久记忆。

Note: If you’re wondering why this method may be more efficient than simply downloading and caching files: not only does this method improve performance by avoiding the latency of multiple DNS lookups and HTTP requests, but mobile devices have more limited cache, with iOS having no persistent memory.

从 localStorage 中提取数据会对性能造成影响 ( http://calendar.perfplanet.com/2011/localstorage-read-performance/ )。然而,当谈到移动时,它的影响不如延迟,尤其是带宽有限的延迟。

Pulling data out of localStorage is a performance hit (http://calendar.perfplanet.com/2011/localstorage-read-performance/). When it comes to mobile, however, it is less of a hit than latency, especially latency with limited bandwidth.

记忆

Memory

大多数性能建议都侧重于提高 I/O 速度。仅关注移动领域中完成响应所需的时间是不够的。当涉及到移动和大多数移动设备上的有限内存时,我们还必须管理设备上发生的情况。作为开发人员,我们通常在内存几乎无限的个人计算机上进行开发。然而,移动用户在内存非常有限的设备上运行我们的网站。

Most performance recommendations focus on improving I/O speeds. In the mobile space, it is not sufficient to focus only on how long it takes for responses to complete. Given the limited memory on most mobile devices, we also have to manage what happens on the device. As developers, we generally develop on our personal computers where memory is virtually unlimited. Mobile users, however, are running our sites on devices with very limited memory.

过去 20 年来,个人计算机上的内存几乎呈指数级增长。1997 年,256MB 可能足以在 Pentium II 上运行所有软件。然而,在 2011 年,基本型号(即“慢速”)计算机配备了至少 2GB 的 RAM。iPhone 3G 拥有 128MB 内存。最初的 iPad 有 256MB。速度更快的 HTC Inspire 有 768MB。新型高端智能手机的标准配置是大约 512MB RAM 和 1GHz 处理器。移动设备拥有 2011 年编写的软件,但运行在具有 1997 年台式机内存的设备上。

Memory on personal computers has increased almost exponentially over the past 2 decades. 256MB may have been more than enough to run all software on a Pentium II in 1997. In 2011, however, base model (i.e., “slow”) computers come with at least 2GB of RAM. An iPhone 3G has 128MB of memory. The original iPad has 256MB. The faster HTC Inspire has 768MB. The norm for new, high-end smart phones is around 512MB of RAM with 1GHz processors. Mobile devices have software written in 2011, but run on devices that have the memory of a 1997 desktop.

虽然 512MB 似乎足以运行任何 Web 应用程序,但在管理内存时,重要的是要记住浏览器(和 Web 应用程序)并不是消耗有限 RAM 的唯一进程。操作系统、后台进程和其他打开的应用程序(操作系统和用户启动的)都共享内存。移动设备通常运行许多本机应用程序以及用户安装的应用程序,无论用户是否知情。正在运行的应用程序有很多,包括用户启动的应用程序(如 Twitter、GPS、Facebook)、设备附带但可能在用户不知情的情况下运行的应用程序(如日历和媒体)以及用户下载的应用程序(如“愤怒的小鸟”)。本机操作系统应用程序和所有打开用户通知的应用程序继续在后台运行。具有 512MB RAM 的设备的可用内存可能少于 200MB。在管理内存时,请记住,您的 Web 应用程序最活跃的用户也可能是使用其他移动应用程序的用户。测试时,使用真实世界的设备进行测试。在所有测试设备上运行 Twitter、Facebook 和 Mail 等带有通知的应用程序。

While 512MB may seem large enough to run any web application, in managing memory it is important to remember that the browser (and web application) is not the only process consuming the limited RAM. The operating system, background processes, and other open applications (operating system and user initiated) are all sharing the memory. Mobile devices are generally running many native applications as well as user-installed apps, with or without the user’s knowledge. Many applications may be running, including user-initiated apps like Twitter, GPS, and Facebook; apps that came with the device but may be running unbeknownst to the user, like Calendar and Media; and applications downloaded by the user, like Angry Birds. Native OS applications and all apps with user notifications turned on continue to run in the background. A device with 512MB of RAM likely has less than 200MB of available memory. In managing memory, remember that your web application’s most active users are likely also the ones using other mobile applications. When testing, test with real-world devices. Run apps like Twitter, Facebook, and Mail with notifications on all your testing devices.

设备上运行的应用程序数量越多,可用于 Web 应用程序的内存就越少。而且,即使这些应用程序都不占用内存,后台运行的应用程序数量之多也会造成高内存使用情况。高内存使用率会导致 UI 缓慢,当浏览器内存不足时,就会出现内存不足的情况。移动浏览器通常会关闭或崩溃以释放内存。您需要管理 Web 应用程序的内存要求,以确保它们不会减慢移动浏览器或使移动浏览器崩溃。

The greater the number of applications running on a device, the less memory available for your web application. And, even if none of those applications are memory hogs, the sheer number of apps running in the background creates high memory usage conditions. High memory usage causes a slow UI, and when the browser is out of memory, it is out of memory. The mobile browser will generally close or crash to free up memory. You need to manage the memory requirements of your web applications to ensure they don’t slow or crash the mobile browser.

优化图像

Optimize Images

除了避免 CSS 表达式 (YSlow) 和优化图像 (PageSpeed) 之外,性能优化指南还与 I/O 有关,而不是网站在设备上后发生的情况。虽然 gzip 压缩文件有助于提高下载速度,但它对内存管理没有帮助。一旦资产位于设备上,它就不再被压缩。图像会占用内存。超过 1024 像素的图像会导致更大的内存问题。通过提供图像的显示尺寸并按该尺寸压缩图像来减小图像文件大小。有一些工具可供您使用。ImageAlpha ( http://pngmini.com/ ) 可以帮助您将透明 png 转换为完全透明的 8 位 png。Sencha.io (http://www.sencha.com/learn/how-to-use-src-sencha-io/)代理确定用户设备所需的图像大小,并在将图像发送到客户端之前缩小(而不是增大)图像。

Apart from the advice to avoid CSS expressions (YSlow) and to optimize images (PageSpeed), the performance optimization guidelines deal with I/O and not with what happens once the site is on the device. While gzipping files helps improve download speed, it does not help with memory management. Once the asset is on the device, it is no longer compressed. Images use up memory. Images over 1024px cause greater memory issues. Reduce your image file sizes by serving up the image with the dimensions at which it will be displayed, and by compressing the image at that size. There are a few tools at your disposal. ImageAlpha (http://pngmini.com/) can help convert your transparent PNGs into 8-bit PNGs with full transparency. The Sencha.io (http://www.sencha.com/learn/how-to-use-src-sencha-io/) proxy determines what size image the user’s device requires and will shrink (not grow) images before sending them to the client.

虽然减小图像文件大小对于 Web 性能一直很重要,但在移动设备方面,我们不能只关注 I/O 文件大小。由于内存有限,您必须考虑未压缩的图像文件有多大。所有图像都会占用内存。合成图像使用 GPU 内存而不是 CPU 内存。因此,虽然这可能是释放一些内存的巧妙技巧,但合成图像占用的内存是非合成图像的四倍,因此请谨慎使用此技巧。

While reducing image file size has always been important for web performance, when it comes to mobile, we can’t focus only on the I/O file size. You have to consider how large the image file is uncompressed as memory is limited. All images use up memory. Composited images use GPU memory instead of CPU memory. So, while that may be a neat trick to free up some memory, composited images use up four times the memory of their non-composited counterparts, so use this trick sparingly.
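A back-of-the-envelope way to estimate the uncompressed footprint of an image is width times height times bytes per pixel. The sketch below assumes 4 bytes per pixel (8 bits each for red, green, blue, and alpha), which is a common decoded representation but not something every browser or device guarantees; the helper names are made up for this example.

```javascript
// Assumption: decoded bitmaps use 4 bytes per pixel (RGBA, 8 bits per channel).
var BYTES_PER_PIXEL = 4;

// Approximate memory used by a decoded image, regardless of its compressed file size.
function decodedImageBytes(width, height) {
  return width * height * BYTES_PER_PIXEL;
}

// Convert bytes to megabytes (1 MB taken as 1024 * 1024 bytes here).
function toMegabytes(bytes) {
  return bytes / (1024 * 1024);
}

// A 1024 x 1024 image decodes to 4,194,304 bytes, i.e. 4 MB under this assumption,
// no matter how small the PNG or JPEG was on the wire.
var largeImageMb = toMegabytes(decodedImageBytes(1024, 1024));
```

Serving a 320 x 320 image instead would take roughly a tenth of that memory, which is why matching image dimensions to display size matters on constrained devices.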

我建议将您随时使用的 Web 应用程序文件(JS、CSS、HTML 和当前显示的图像)控制在 80MB 以下。

I recommend keeping the web application files in use at any one time (JS, CSS, HTML, and images currently displayed) under 80MB.

权衡 CSS 的好处

Weigh the Benefits of CSS

CSS 可以帮助减少 HTTP 请求的数量并减小所发出请求的大小。使用渐变、边框半径、框和文本阴影以及边框图像,可以大大减少 HTTP 请求的数量。CSS的好处在于效果有:

CSS can help reduce the number of HTTP requests and reduce the size of the requests that are made. With gradients, border-radius, box and text shadow, and border images, you can greatly reduce the number of HTTP requests. The benefit of CSS is that effects are:

  • 需要更少的 HTTP 请求

  • Requiring fewer HTTP requests

  • 可更新

  • Updatable

  • 可扩展

  • Scalable

  • 可过渡

  • Transitionable

  • 可动画化

  • Animatable

然而,将这些效果绘制到屏幕上会产生相关成本。有时 png、gif 和 jpeg 的渲染速度比 CSS 效果更快并且使用的内存更少。任何可变形的 CSS 功能通常都会在每次回流和重绘时进行评估,从而耗尽内存。与 CSS 生成的图像不同,PNG、JPEG 和 GIF 图像作为位图进行渲染和转换,通常使用更少的内存(但更多的 HTTP 请求)。例如,阴影(尤其是嵌入阴影)会保留在内存中并重新绘制,即使被另一个具有较高 z-index 的元素混淆也是如此。而且,虽然径向渐变可能需要 140 个 CSS 字符,但浏览器将绘制整个渐变并将其保留在内存中,而不仅仅是视口中显示的渐变部分。

However, painting these effects to the screen has associated costs. Sometimes PNGs, GIFs, and JPEGs render faster and use less memory than CSS effects. Any CSS feature that is transformable is generally evaluated at each reflow and repaint, using up memory. PNG, JPEG, and GIF images, unlike CSS-generated images, are rendered and transitioned as bitmaps, often using less memory (but more HTTP requests). For example, shadows, especially inset shadows, are kept in memory and are repainted even if obscured by another element with a higher z-index. And, while a radial gradient may take 140 characters of CSS, the browser will paint and keep in memory the entire gradient, not just the section of gradient that is displayed in the viewport. I recommend using linear gradients and native rounded corners over images, but weigh the performance of radial gradients and inset shadows against the cost of downloading an image.

权衡 CSS 的好处。虽然与使用 PhotoShop 和上传导出图片相比,CSS 图像通常是首选解决方案,但由于内存使用和渲染缓慢,某些 CSS 功能具有隐藏成本。

Weigh the benefits of CSS. While CSS images are generally the preferred solution over using Photoshop and uploading exported pictures, some CSS features have hidden costs due to memory usage and slow rendering.

GPU 的优点和缺点

GPU Benefits and Pitfalls

在某些设备上,通过将元素转换或变换到 3D 空间,该元素可以进行硬件加速 (http://www.html5rocks.com/en/tutorials/speed/html5/#toc-hardware-accell)。通过将元素的渲染从 CPU 转移到 GPU,可以极大地提高性能,尤其是在制作动画时。然而,translate3D 并不是万能的!硬件加速元素被合成。合成元素占用四倍的内存量。使用 GPU 代替 CPU 将在一定程度上提高性能。虽然硬件加速元素占用的 RAM 较少,但它们确实会占用视频内存,因此请谨慎使用 div { transform: translateZ(0); } 这一技巧。

On some devices, by transitioning or transforming an element into a 3D space, the element is hardware accelerated (http://www.html5rocks.com/en/tutorials/speed/html5/#toc-hardware-accell). By transferring the rendering of the element from the CPU to the GPU, you can greatly improve performance, especially when animating. However, translate3D is not a panacea! Hardware-accelerated elements are composited. Composited elements take up four times the amount of memory. Using GPU instead of CPU will improve performance up to a point. While hardware-accelerated elements use up less RAM, they do use up video memory, so use the div { transform: translateZ(0); } trick sparingly.

视口:看不见并不意味着心不在焉

Viewport: Out of Sight Does Not Mean Out of Mind

手机视口是可视屏幕区域。与滚动内容的桌面浏览器不同,在移动设备上,除非设置了视口高度和宽度,并且禁用了缩放,否则视口是固定的,用户会在下面移动内容。视口是用户查看内容的“端口”。为什么这是一个性能问题?大多数人没有意识到绘制到页面的内容,即使在当前视口中不可见,仍然在内存中。

The mobile phone viewport is the viewable screen area. Unlike your desktop browser where you scroll content, on mobile devices unless the viewport height and width are set, and scaling is disabled, the viewport is fixed and the user moves the content underneath. The viewport is a “port” through which your users view your content. Why is this a performance issue? Most people don’t realize that content drawn to the page is still in memory, even if it is not visible in the current viewport.

最小化 DOM

Minimize the DOM

每次回流时,都会测量每个 DOM 节点。桌面上的 CPU 可以处理几乎无限数量的节点(它最终会崩溃)。移动设备上的内存有限,并且垃圾收集各不相同,因此并不完全可靠。为了提高性能,请尽量减少节点数量。不要分配 DOM 节点并销毁它们(或忘记销毁它们),而是池化并重用您的节点。例如,如果您要创建纸牌游戏,请创建不超过 52 个节点,重复使用池节点,而不是每次将纸牌添加回游戏时创建一个新节点。

Every time there is a reflow, every DOM node is measured. The CPU on your desktop can handle a virtually endless number of nodes (though it will eventually crash). Memory on mobile devices is limited, and garbage collection varies, so it cannot be fully relied upon. To improve performance, minimize the number of nodes. Instead of allocating DOM nodes and destroying them (or forgetting to destroy them), pool and reuse your nodes. For example, if you’re creating a card game, create no more than 52 nodes, reusing pooled nodes instead of creating a new node every time a card is added back into the game.
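The card game example can be sketched as a small pool. This is only an illustration: createPool, acquire, and release are hypothetical names, and plain objects stand in for DOM nodes so the sketch runs outside a browser. In a page, the factory would more likely be something like () => document.createElement('div').

```javascript
// A minimal object pool. `createNode` is a caller-supplied factory (hypothetical);
// `maxSize` caps how many nodes are ever allocated.
function createPool(createNode, maxSize) {
  const free = []; // nodes ready for reuse
  let created = 0; // how many nodes have been allocated in total

  return {
    // Hand out a pooled node, allocating a new one only while under the cap.
    acquire() {
      if (free.length > 0) return free.pop();
      if (created < maxSize) {
        created += 1;
        return createNode();
      }
      return null; // pool exhausted: the caller must release a node first
    },
    // Return a node to the pool instead of destroying it.
    release(node) {
      free.push(node);
    },
    // Expose the allocation count for inspection.
    get created() {
      return created;
    },
  };
}

// Usage with plain objects standing in for DOM nodes.
const cards = createPool(() => ({ label: '' }), 52);
const first = cards.acquire();
first.label = 'Ace of Spades';
cards.release(first);
const again = cards.acquire(); // the same object comes back, with no new allocation
console.log(again === first, cards.created); // true 1
```

A released node keeps whatever state it had, so the caller resets its content before showing it again.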

正如您从 JavaScript 最佳实践中知道的那样,通过读取或写入来接触 DOM 的成本很高。缓存 DOM 查找并将它们存储在变量中。

As you know from JavaScript best practices, touching the DOM with a read or write is expensive. Cache DOM lookups and store them in variables.

此外,单独批处理 DOM 查询和 DOM 操作,通过在更新 DOM 之前完全更新 DOM 之外的内容来最大限度地减少 DOM 操作。

Also, batch DOM queries and DOM manipulations separately, minimizing DOM manipulations by updating content fully outside of the DOM before updating the DOM.
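A minimal sketch of this read-then-write batching follows. The doubleHeights function is a made-up example, and plain objects with offsetHeight and style properties stand in for elements so that it runs outside a browser. It relies on the general observation that alternating reads and writes can force the browser to recompute layout repeatedly.

```javascript
// Double the height of each element using two separate phases.
function doubleHeights(elements) {
  // Read phase: collect every measurement before changing anything.
  const heights = elements.map((el) => el.offsetHeight);

  // Write phase: apply all changes after all reads are done.
  elements.forEach((el, i) => {
    el.style.height = heights[i] * 2 + 'px';
  });
}

// In a browser, the lookup would be done once and cached in a variable, for example:
//   const items = Array.from(document.querySelectorAll('.item'));
// Here, plain objects stand in for elements.
const items = [
  { offsetHeight: 10, style: {} },
  { offsetHeight: 25, style: {} },
];
doubleHeights(items);
console.log(items.map((el) => el.style.height).join(', ')); // 20px, 50px
```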

在管理内存时,图像优化、CSS 渲染和 DOM 节点计数并不是唯一需要关注的问题。这些只是桌面领域在关注性能时不一定要考虑的要点。

When it comes to managing memory, image optimization, CSS rendering, and DOM node count are not the only points of concern. These are just points that are not necessarily considered in the desktop space when focusing on performance.

用户界面响应能力

UI Responsiveness

移动浏览器是单线程的(http://www.nczonline.net/blog/2010/08/10/what-is-a-non-blocking-script/)。在这方面,移动浏览器与桌面浏览器类似。但由于设备的限制,移动设备有所不同。管理 JavaScript 始终很重要。由于电池使用和内存的原因,臃肿且低效的 JavaScript 在移动设备上的问题甚至更大。

Mobile browsers are single threaded (http://www.nczonline.net/blog/2010/08/10/what-is-a-non-blocking-script/). In that respect, mobile browsers are similar to desktop browsers. Mobile devices are different though because of the limitations of the device. It is always important to manage your JavaScript. Bloated and inefficient JavaScript is even more problematic on mobile devices because of battery usage and memory.

移动设备上的 UI 响应能力不仅仅是单线程能力。由于延迟,浏览器在选择操作后可能会出现挂起的情况,因为往返可能需要一段时间。在采取操作后 200 毫秒内提供用户反馈非常重要。如果您显示或隐藏某个元素,则无需提供反馈,因为应用程序将做出响应。但是,如果您的用户必须等待 UI 更新的往返,请提供反馈以表明您的网站正在响应。

There is more to UI responsiveness on mobile than just single-threaded-ness. Because of latency, the browser may appear to hang after the user selects an action, because the round trip can take a while. It is important to provide user feedback within 200ms after an action is taken. If you are showing or hiding an element, there’s no need to provide feedback, since the app will be responsive. However, provide feedback to indicate that your site is responding if your user has to wait for a round trip for a UI update.

此外,由于移动设备是触摸设备,并且“双击”是潜在的用户动作,因此移动设备实际上会等待潜在的双击才响应用户动作。在 iOS 设备上,touchend 事件发生后默认等待 300 毫秒,然后再采取任何操作。因此,您可能希望通过向 touchend 事件添加事件侦听器来选择默认事件(例如点击),以使您的应用程序响应更快。

In addition, because the mobile device is a touch device, and “double tap” is a potential user action, mobile devices actually wait for potential double taps before responding to user action. On iOS devices there is a default 300ms wait after the touchend event before any action is taken. Because of this, you may want to co-opt default events like the tap by adding an event listener to the touchend event to make your application more responsive.
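One way to sketch this is a helper that responds on touchend and falls back to click for non-touch input. The attachFastTap name and its callback are hypothetical, and the sketch is deliberately naive: it does not distinguish a tap from the end of a scroll gesture, which a production implementation would need to handle.

```javascript
// Respond on touchend instead of waiting for the delayed click.
// `element` is anything with addEventListener; `onTap` is a caller-supplied callback.
function attachFastTap(element, onTap) {
  let handledByTouch = false;

  element.addEventListener('touchend', (event) => {
    handledByTouch = true;
    event.preventDefault(); // ask the browser not to send the delayed click, where supported
    onTap(event);
  });

  // Fallback for mouse or keyboard activation, where no touchend fires.
  element.addEventListener('click', (event) => {
    if (handledByTouch) {
      handledByTouch = false; // swallow a click that follows a handled touch
      return;
    }
    onTap(event);
  });
}
```

In a page this might be used as attachFastTap(document.getElementById('save'), submitForm), where both the element ID and the callback are placeholders.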

概括

Summary

前面的内容并不是确保良好的移动 UI 性能时需要考虑的主题的详尽列表,但应该是一个好的开始。请记住,移动设备是我们用户中增长最快的部分。不要忽视他们。

The preceding is not an exhaustive list of topics to consider in ensuring good mobile UI performance, but it should be a good start. Remember that mobile is the fastest-growing segment of our users. Don’t ignore them.

作为开发人员,我们测试了我们的网站,以确保我们遵循雅虎的 YSlow 和 Google 的 PageSpeed 推荐的要点和目标。我们已经使用我们的桌面浏览器进行了测试和测试。我们假设 Web 性能优化指南可以提高所有浏览器的 Web 应用程序性能,无论我们的用户是通过笔记本电脑、iPad、Android 手机还是 Wii 访问该网站。而且,在很大程度上,确实如此。但请记住,当涉及到移动设备时,众所周知且受到关注的优化指南并不是我们唯一关心的问题。

As developers, we’ve tested our websites to make sure we’ve followed the points and goals recommended by Yahoo’s YSlow and Google’s PageSpeed. We’ve tested and tested… using our desktop browsers. We’ve assumed the web performance optimization guidelines improve web application performance for all browsers, whether our users are accessing the site on their laptop, iPad, Android phone, or even their Wii. And, to a great extent, they do. But remember that the well-known and heeded optimization guidelines aren’t our only concern when it comes to mobile.

请继续测试您的网站,但请确保在移动设备上进行测试。模拟器不是模拟器。模拟器不会模拟内存限制,也不会模拟打开 100 个应用程序的设备。在真实场景中的真实设备上进行测试(打开 WiFi 并在后台挂起许多未关闭的应用程序进行测试)。

Do continue testing your website, but make sure to test on mobile devices. Emulators are not simulators. The emulator does not simulate memory constraints and does not simulate the device with 100 apps open. Test on real devices in real scenarios (turn the WiFi off and test with many, many unclosed apps hanging in the background).

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/mobile-ui-performance-considerations/。最初发布于 2011 年 12 月 18 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/mobile-ui-performance-considerations/. Originally published on Dec 18, 2011.

第 19 章:停止使用 Google Analytics 网站速度报告浪费时间

Chapter 19. Stop Wasting Your Time Using the Google Analytics Site Speed Report

亚伦 ·彼得斯

Aaron Peters

自 2011 年 5 月起,Google Analytics 中的网站速度报告会显示您的页面为真实访问者加载的速度。Google Analytics 在所有支持导航计时 API (http://w3c-test.org/webperf/specs/NavigationTiming/) 的浏览器(IE9+、Chrome、FF7+、Android4+ (http://caniuse.com/#search=navigation))中使用该 API 测量页面加载时间,对于旧版 IE 和 Firefox 则回退到 Google 工具栏数据。在 GA 中拥有页面速度数据非常棒,因为您可以轻松地将其与跳出率和转化率关联起来,从而获得出色的、可操作的见解,从而带来更快的网站、更满意的用户和更多收入。但是,如果您的访问者中有很大一部分使用 Firefox 7 或 8,您很可能会浪费大量时间来解释网站速度数据,甚至会浪费更多时间采取错误的操作。

Since May 2011 the Site Speed report in Google Analytics shows how fast your pages load for your real visitors. Google Analytics measures page load time by using the Navigation Timing API (http://w3c-test.org/webperf/specs/NavigationTiming/) in all browsers that support it (IE9+, Chrome, FF7+, Android4+ (http://caniuse.com/#search=navigation)) and falls back to Google Toolbar data for older versions of IE and Firefox. Having page speed data in GA is great, because you can easily correlate it to bounce rate and conversion, resulting in great, actionable insight that down the road leads to a faster site, happier users, and more revenues. But if a significant percentage of your visitors use Firefox 7 or 8, you may very well be wasting a lot of time interpreting the Site Speed data and even more time taking the wrong actions.

问题:Firefox 导航计时 API 实现中的一个错误

Problem: A Bug in Firefox Implementation of the Navigation Timing API

Firefox 在 2011 年 9 月 27 日发布的版本 7 中实现了导航计时 API。从那天起,该浏览器在该 API 的实现中出现了一个错误。您可以在 Bugzilla 上的错误通知单 (https://bugzilla.mozilla.org/show_bug.cgi?id=691547) 中阅读所有相关信息。问题是 window.performance.timing.navigationStart 的值可能太低,这意味着它距离过去太远了。Google Analytics 使用一个简单的公式来计算页面加载时间:loadTime = window.performance.timing.loadEventStart - window.performance.timing.navigationStart。如果 navigationStart 太低,页面加载时间将会太长。

Firefox implemented the Navigation Timing API in version 7, which was released on September 27, 2011. From that release on, the browser’s implementation of the API has had a bug. You can read all about it in the bug ticket (https://bugzilla.mozilla.org/show_bug.cgi?id=691547) on Bugzilla. The problem is that the value for window.performance.timing.navigationStart can be too low, which means it is too far in the past. Google Analytics uses a simple formula to calculate page load time: loadTime = window.performance.timing.loadEventStart - window.performance.timing.navigationStart. If navigationStart is too low, the page load time will be too high.
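The effect of the formula can be illustrated with a small sketch. The computeLoadTime function mirrors the formula quoted above. The isPlausible helper and its ceiling are hypothetical additions for illustration, not something Google Analytics is known to do. The timestamps are made-up values in milliseconds.

```javascript
// loadTime = loadEventStart - navigationStart, as described in the text.
// `timing` has the same shape as window.performance.timing (millisecond timestamps).
function computeLoadTime(timing) {
  return timing.loadEventStart - timing.navigationStart;
}

// Hypothetical sanity check: flag samples outside a plausible range
// instead of letting them skew an average.
function isPlausible(loadTimeMs, ceilingMs) {
  return loadTimeMs >= 0 && loadTimeMs <= ceilingMs;
}

// A correct sample: the page took 3,120ms to load.
const good = { navigationStart: 1000000, loadEventStart: 1003120 };

// The same page load with navigationStart reported 50 seconds too far in the past.
const skewed = { navigationStart: 950000, loadEventStart: 1003120 };

console.log(computeLoadTime(good)); // 3120
console.log(computeLoadTime(skewed)); // 53120
console.log(isPlausible(computeLoadTime(skewed), 30000)); // false
```

In a browser, the input would be window.performance.timing itself, read after the load event has fired.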

我在 GA 网站速度报告中发现这个错误对页面加载时间有很大影响。在我客户的一个网站上,27% 的访问者使用 Firefox 7 或 8,24% 使用 Chrome 15 或 16。网站速度报告显示,Firefox 用户的平均页面加载时间为 7.23 秒,Chrome 用户的平均页面加载时间为 3.12 秒。当放大单个页面和日期时,我经常看到所有大峰值(30、50 或 100 秒以上的加载时间)都来自 Firefox。从来没有 Chrome,从来没有 IE,总是 Firefox。

I see this bug affecting page load times in the GA Site Speed report a lot. On one of my clients’ sites, 27% of visitors use Firefox 7 or 8 and 24% use Chrome 15 or 16. The Site Speed report shows that the average page load time for Firefox users is 7.23 seconds and for Chrome it is 3.12 seconds. When zooming in on individual pages and dates, I often see that all the big spikes (30, 50, or 100+ seconds load times) are coming from Firefox. Never Chrome, never IE, always Firefox.

至少有一家商业 Web 应用程序性能监控服务提供商已针对此错误采取了行动。New Relic 向我证实,他们不使用 Firefox 中的导航计时 API 来计算页面加载时间,因为它不准确。

At least one commercial web application performance monitoring service provider has taken action on this bug. New Relic confirmed to me that they don’t use the Navigation Timing API in Firefox to calculate page load time because it is not accurate.

那么,如何才能不让这个 bug 弄乱 GA 中的数据呢?

So, what can you do to not have this bug mess up your data in GA?

解决方案:在 Google Analytics 中过滤掉 Firefox 计时

Solution: Filter Out the Firefox Timings in Google Analytics

在 Google Analytics 中,创建自定义报告并过滤掉来自 Firefox 访问者的所有数据(图 19-1)。

In Google Analytics, create a Custom Report and filter out all data from Firefox visitors (Figure 19-1).

Google Analytics 中的自定义报告

图 19-1。Google Analytics 中的自定义报告

Figure 19-1. Custom Report in Google Analytics

好消息:该错误已在 Firefox 9 中修复

Good News: The Bug Was Fixed in Firefox 9

Mozilla 在 2011 年 12 月 20 日发布的 Firefox 9 中修复了该错误 ( https://wiki.mozilla.org/Releases#Firefox_9 )。现在大多数访问者已经升级到 Firefox 12,您可以删除 Google Analytics 中的过滤器。

Mozilla fixed the bug in Firefox 9, which was released on December 20, 2011 (https://wiki.mozilla.org/Releases#Firefox_9). Now that most visitors have upgraded to Firefox 12, you can remove the filter(s) in Google Analytics.

结束语

Closing Remark

您可能已经了解这个问题。在 Google Analytics 在线帮助的此页面 ( http://support.google.com/analytics/bin/answer.py?hl=en&answer=1205784 ) 中,几乎在页面底部有一条注释提到了 Firefox漏洞。Google 在此暗示,自 11 月 16 日以来,该错误一直在影响网站速度报告中的加载时间。我不知道为什么。据我所知,该错误从第一天(9 月 27 日)就存在于 FF 7 中,并且也存在于 Firefox 8 中。在我看来,Google Analytics 团队应该就此撰写一篇博客文章,而不仅仅是在在线帮助中提及,因为许多 GA 用户可能从未看过在线帮助。

You may already have known about this issue. In the Google Analytics Online Help, on this page (http://support.google.com/analytics/bin/answer.py?hl=en&answer=1205784), there is a note almost at the bottom of the page mentioning the Firefox bug. Google implies here that the bug has been impacting load times in the Site Speed report since November 16. I have no idea why. As far as I know, the bug has been in FF 7 from day one (September 27) and exists in Firefox 8 as well. In my opinion, the Google Analytics team should have written a blog post about this, and not merely mentioned it in the Online Help, where many GA users probably never look.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/stop-waisting-your-time-using-the-google-analytics-site-speed-report/。最初发布于 2011 年 12 月 19 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/stop-waisting-your-time-using-the-google-analytics-site-speed-report/. Originally published on Dec 19, 2011.

第 20 章 超越 Web 开发工具:Strace

Chapter 20. Beyond Web Developer Tools: Strace

托尼· 詹蒂科尔

Tony Gentilcore

丰富的开发人员工具适用于所有现代 Web 浏览器。它们通常易于使用,并且可以提供优化网页所需的所有信息。很少需要超越 Web Inspector 的时间轴面板 ( http://www.webkit.org/blog/1091/more-web-inspector-updates/#timeline_panel ) 的统一网络/脚本/渲染视图。

Rich developer tools are available for all modern web browsers. They are typically easy to use and can provide all the information necessary to optimize web pages. It is rare to need to go beyond the unified networking/scripting/rendering view of the Web Inspector’s Timeline panel (http://www.webkit.org/blog/1091/more-web-inspector-updates/#timeline_panel).

但它们并不总是完美的:一个工具可能会丢失信息,可能与另一个工具不一致,或者可能只是不正确。例如,最近的错误 (https://bugs.webkit.org/show_bug.cgi?id=58354) 偶尔会导致两个导航计时 (https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html) 指标在 Chrome(和 Inspector)中不正确。

But they aren’t always perfect: a tool may be missing information, may disagree with another tool, or may just be incorrect. For instance, a recent bug (https://bugs.webkit.org/show_bug.cgi?id=58354) occasionally caused two Navigation Timing (https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/NavigationTiming/Overview.html) metrics to be incorrect in Chrome (and the Inspector).

当出现这些罕见的情况时,优秀的工程师能够超越浏览器的开发人员工具,准确地找出浏览器告诉操作系统要做什么。在 Linux 上,可以使用 strace 找到这个终极真理的来源。该工具可以跟踪浏览器进行的每个系统调用。由于每个网络和文件访问都需要系统调用,而这正是浏览器花费大量时间的地方,因此它非常适合调试多种类型的浏览器性能问题。

When these rare situations arise, great engineers are able to go beyond a browser’s developer tools to find out exactly what the browser is telling the operating system to do. On Linux, this source of ultimate truth can be found using strace. This tool can trace each system call made by a browser. Since every network and file access entails a system call, and this is where browsers spend a lot of their time, it is perfect for debugging many types of browser performance issues.

其他平台呢?

What About Other Platforms?

在这篇文章中,我介绍 strace,因为语法很干净并且不需要任何设置。但大多数系统都有一个等效的工具来跟踪系统调用。移动开发人员会很高兴听到 Android 完全支持 strace。OS X 用户会发现 dtrace 提供了更强大的功能,但代价是不太直观的语法(不幸的是没有移植到 iOS)。最后,Windows 事件跟踪 (ETW) 虽然难以设置,但支持友好的 GUI。

In this post, I introduce strace because the syntax is clean and no setup is required. But most systems have an equivalent tool for tracing system calls. Mobile developers will be happy to hear that strace is fully supported by Android. OS X users will find dtrace offers more powerful functionality at the expense of less intuitive syntax (unfortunately not ported to iOS). Finally, Event Tracing for Windows (ETW), while harder to set up, supports a friendly GUI.

入门

Getting Started

要使用它:打开终端并在命令提示符下调用 strace。此调用会在启动 Google Chrome 并打开 google.com 时打印所有系统调用:

To use it: open a terminal and invoke strace at the command prompt. This invocation prints all system calls while starting Google Chrome to google.com:

$ strace -f -ttt -T google-chrome http://www.google.com/

$ strace -f -ttt -T google-chrome http://www.google.com/

我添加了 -f 来跟踪 fork 出的子进程,-ttt 来打印每次调用的时间戳,-T 来打印每次调用的持续时间。

I’ve added -f to follow forks, -ttt to print the timestamp of each call and -T to print the duration of each call.

归零

Zeroing In

如果您运行前面的命令,您可能会对现代 Web 浏览器中发生的大量内容感到不知所措。要过滤出一些有趣的内容,请尝试使用 -e 参数。如果仅检查文件或网络访问,请尝试 -e trace=file 或 -e trace=network。手册页 (http://linux.die.net/man/1/strace) 有更多示例。

If you run the preceding command, you’ll probably be overwhelmed by the amount of stuff going on in a modern web browser. To filter down to something interesting, try using the -e argument. For examining only file or network access, try -e trace=file or -e trace=network. The man page (http://linux.die.net/man/1/strace) has many more examples.

示例:本地存储

Example: Local Storage

作为一个具体示例,让我们跟踪 Chrome 中的本地存储性能。首先,我打开了本地存储配额测试页面 (http://arty.name/localstorage.html)。然后,我从 Chrome 的任务管理器(扳手 > 工具 > 任务管理器)检索 Chrome 浏览器进程的 ID,并使用 -p 开关将 strace 附加到该进程。

As a concrete example, let’s trace local storage performance in Chrome. First I opened a local storage quota test page (http://arty.name/localstorage.html). Then I retrieved the Chrome browser process’s ID from Chrome’s task manager (Wrench > Tools > Task Manager) and attached strace to that process using the -p switch.

$ strace -f -T -p <process id> -e trace=open,read,write

$ strace -f -T -p <process id> -e trace=open,read,write

输出显示每个 open、read 和 write 系统调用的时间戳、参数和返回值。每个调用的手册页都解释了参数和返回值。我们感兴趣的第一个调用是这个 open:

The output shows the timestamps, arguments and return value of every open, read, and write system call. The man page for each call explains the arguments and return values. The first call of interest to us is this open:

open("/home/tonyg/.config/google-chrome/Default/Local Storage/http_arty.name_0.localstorage-journal", O_RDWR|O_CREAT, 0640) = 114 <0.000391>

open("/home/tonyg/.config/google-chrome/Default/Local Storage/http_arty.name_0.localstorage-journal", O_RDWR|O_CREAT, 0640) = 114 <0.000391>

这向我们表明 Chrome 已打开此文件进行读取和写入(并且可能创建了它)。文件名是一个重要线索,表明这是 arty 网页保存本地存储的位置。返回值 114 是文件描述符,它将在以后的读写中识别它。现在我们可以查找对 fd 114 进行操作的 read 和 write 调用,例如:

This shows us that Chrome has opened this file for reading and writing (and possibly created it). The name of the file is a big clue that this is where local storage is saved for arty’s web page. The return value, 114, is the file descriptor, which will identify it in later reads and writes. Now we can look for read and write calls which operate on fd 114, for example:

write(114, "\0\0\00020\0001\0002\0003\0004\0005\0006\0007\0008\0009\0000\0001\0002\0003\0"..., 1024 <unfinished ...>
<... write resumed> ) = 1024 <0.425476>

write(114, "\0\0\00020\0001\0002\0003\0004\0005\0006\0007\0008\0009\0000\0001\0002\0003\0"..., 1024 <unfinished ...>
<... write resumed> ) = 1024 <0.425476>

这两行显示将以上面的字符串开头的 1,024 字节数据写入到本地存储文件 (114)。这次写入恰好花费了 425 毫秒。请注意,该调用被分成两行,中间可能还有其他行,因为另一个线程抢占了它。这对于像这样的较慢的调用来说很常见。

These two lines show a 1,024-byte write of the data beginning with the string above to the local storage file (fd 114). This write happened to take 425ms. Note that the call is split into two lines, possibly with others in between, because another thread preempted it. This is common for slower calls like this.
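Reading the duration field by eye gets tedious on long traces. The following is a rough sketch of a parser that pulls the call name, return value, and duration out of lines like the ones above. The regular expressions are based only on the output shown in this chapter plus the [pid N] prefix that strace adds when following several processes. The names parseCall and slowCalls are illustrative, and other strace versions or options may format lines differently.

```javascript
// A completed call, e.g.  open("...", O_RDWR|O_CREAT, 0640) = 114 <0.000391>
const COMPLETE = /^(\w+)\(.*\)\s+=\s+(-?\d+).*<(\d+\.\d+)>$/;
// The second half of a preempted call, e.g.  <... write resumed> ) = 1024 <0.425476>
const RESUMED = /^<\.\.\. (\w+) resumed>.*\)\s+=\s+(-?\d+).*<(\d+\.\d+)>$/;

// Parse one line of `strace -T` output; returns null for lines that carry no duration.
function parseCall(line) {
  // Drop the optional "[pid 1234] " prefix.
  const text = line.trim().replace(/^\[pid\s+\d+\]\s+/, '');
  const match = COMPLETE.exec(text) || RESUMED.exec(text);
  if (!match) return null; // for example, the "<unfinished ...>" half of a split call
  return {
    name: match[1],
    result: Number(match[2]),
    ms: Math.round(Number(match[3]) * 1000), // strace prints seconds; convert to milliseconds
  };
}

// Keep only the calls that took at least `thresholdMs` milliseconds.
function slowCalls(lines, thresholdMs) {
  return lines
    .map(parseCall)
    .filter((call) => call !== null && call.ms >= thresholdMs);
}

const sample = [
  'open("/home/tonyg/.config/google-chrome/Default/Local Storage/http_arty.name_0.localstorage-journal", O_RDWR|O_CREAT, 0640) = 114 <0.000391>',
  '<... write resumed> ) = 1024 <0.425476>',
];
console.log(slowCalls(sample, 100)); // only the 425ms write remains
```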

我们只触及了表面

We’ve Only Scratched the Surface

有一些选项可以转储从网络或文件系统读取/写入的完整数据。运行 with-c 显示有关最常见调用所用时间的聚合统计信息。我还发现一些实用的 python 脚本可以快速将这些跟踪解析为各种有用的格式。

There are options for dumping the full data read/written from the network or filesystem. Running with -c displays aggregate statistics about the time spent in the most common calls. I’ve also found that some practical Python scripting can quickly parse these traces into a variety of useful formats.

这个简短的介绍很难公正地描述这个工具。我只是希望当您下次遇到棘手的性能问题时,它能给您提供更深入探索堆栈的勇气。

This brief introduction hardly does this tool justice. I merely hope it provides the courage to explore deeper into the stack the next time you run into a tricky performance problem.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/beyond-web-developer-tools-strace/。最初发布于 2011 年 12 月 20 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/beyond-web-developer-tools-strace/. Originally published on Dec 20, 2011.

第 21 章 mod_spdy 简介:Apache HTTP 服务器的 SPDY 模块

Chapter 21. Introducing mod_spdy: A SPDY Module for the Apache HTTP Server

布莱恩 ·麦奎德和马修·斯蒂尔

Bryan McQuade and Matthew Steele

在 Google,我们致力于让整个网络变得更快。我们在这一领域的工作包括 Page Speed Online、mod_pagespeed、Page Speed Service、Google Chrome、使 TCP 更快以及 SPDY 协议等。SPDY(发音为“SPeeDY”)协议允许将网站更有效地传输到网络浏览器,从而将页面加载时间缩短 (http://blog.chromium.org/2009/11/2x-faster-web.html) 多达 55%。为了使网站更容易实现 SPDY 的优势,我们发布了 mod_spdy 的源代码,mod_spdy 是 Apache HTTP 服务器的开源模块。

At Google, we strive to make the whole Web fast. Our work in this area includes Page Speed Online, mod_pagespeed, Page Speed Service, Google Chrome, making TCP faster, and the SPDY protocol, among other efforts. The SPDY (pronounced “SPeeDY”) protocol allows websites to be transmitted more efficiently to the web browser, resulting in page load time improvements (http://blog.chromium.org/2009/11/2x-faster-web.html) of as much as 55%. To make it easier for websites to realize the benefits of SPDY, we’re releasing the source code for mod_spdy, an open-source module for the Apache HTTP server.

mod_spdy 入门

Getting Started with mod_spdy

mod_spdy 仍处于早期测试阶段,尚不建议在生产环境中部署。如果您想测试 mod_spdy 并帮助我们改进它,请参阅我们的入门指南。我们希望在 2012 年初的某个时候将其投入生产。请订阅我们的讨论论坛,敬请关注。

mod_spdy is still in early beta, and is not yet recommended for deployment in production environments. If you’d like to test out mod_spdy and help us to make it better, please consult our Getting Started guide. We hoped to make it production-ready sometime in early 2012. Stay tuned by subscribing to our discussion forum.

SPDY 和阿帕奇

SPDY and Apache

mod_spdy 是一个与 Apache 2.2 兼容的模块,为 Apache HTTP 服务器提供 SPDY 支持。多路复用是 SPDY 的一个重要性能特征,它允许在单个 SPDY 会话中同时处理多个请求,并且它们的响应沿线路交错。然而,由于 HTTP/1.1 协议的序列化性质,Apache HTTP 服务器提供了一种每个连接一个请求的架构。Apache 的连接和请求处理通常发生在单个线程中,如图21-1所示。

mod_spdy is an Apache 2.2-compatible module that provides SPDY support for Apache HTTP servers. Multiplexing is an important performance feature of SPDY which allows for multiple requests in a single SPDY session to be processed concurrently, and their responses interleaved down the wire. However, due to the serialized nature of the HTTP/1.1 protocol, the Apache HTTP server provides a one-request-per-connection architecture. Apache’s connection and request processing normally happens in a single thread, as shown in Figure 21-1.

Apache的连接和请求处理

图 21-1。Apache的连接和请求处理

Figure 21-1. Apache’s connection and request processing

这对于 HTTP 来说效果很好,但对于像 SPDY 这样的多路复用协议来说却存在问题,因为在此流程中,每个连接一次只能处理一个请求。一旦 Apache 开始处理请求,控制权就会转移到请求处理程序,并且在请求完成之前不会返回到连接处理程序。

This works well for HTTP, but it presents a problem for multiplexed protocols like SPDY because in this flow, each connection can only process one request at a time. Once Apache starts processing a request, control is transferred to the request handler and does not return to the connection handler until the request is complete.

为了允许 SPDY 多路复用,mod_spdy 将连接处理和请求处理分离到不同的线程中。连接线程负责解码 SPDY 帧并将新的 SPDY 请求分派到 mod_spdy 请求线程池。每个请求线程可以同时处理不同的 HTTP 请求。图 21-2中的图表显示了高级架构。

To allow for SPDY multiplexing, mod_spdy separates connection processing and request processing into different threads. The connection thread is responsible for decoding SPDY frames and dispatching new SPDY requests to the mod_spdy request thread pool. Each request thread can process a different HTTP request concurrently. The diagram in Figure 21-2 shows the high-level architecture.

高层架构

图 21-2。高层架构

Figure 21-2. High-level architecture

要了解有关 mod_spdy 如何在 Apache 中工作的更多信息,请参阅我们的 wiki ( http://code.google.com/p/mod-spdy/wiki/HowItWorks )。

To learn more about how mod_spdy works within Apache, consult our wiki (http://code.google.com/p/mod-spdy/wiki/HowItWorks).

帮助改进 mod_spdy

Help to Improve mod_spdy

您可以通过进行兼容性和性能测试、查看代码来帮助我们改进 mod_spdy ( http://code.google.com/p/mod-spdy/source/browse/trunk/src#src%2Fmod_spdy%2Fcommon )并向我们​​发送有关 mod_spdy 讨论列表的反馈 ( https://groups.google.com/group/mod-spdy-discuss )。我们期待您的贡献和反馈!

You can help us to make mod_spdy better by doing compatibility and performance testing, by reviewing the code (http://code.google.com/p/mod-spdy/source/browse/trunk/src#src%2Fmod_spdy%2Fcommon) and sending us feedback on the mod_spdy discussion list (https://groups.google.com/group/mod-spdy-discuss). We look forward to your contributions and feedback!

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/introducing-mod_spdy-a-spdy-module-for-the-apache-http-server/。最初发布于 2011 年 12 月 21 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/introducing-mod_spdy-a-spdy-module-for-the-apache-http-server/. Originally published on Dec 21, 2011.

第 22 章 CommonJS 模块的惰性求值

Chapter 22. Lazy Evaluation of CommonJS Modules

托比 ·兰格尔

Tobie Langel

大约两年前,移动 Gmail 团队发表了一篇文章,重点讨论减少 HTML5 的启动延迟 ( http://googlecode.blogspot.com/2009/09/gmail-for-mobile-html5-series-reducing.html )应用。它描述了一种技术,可以通过将其放置在注释中来绕过 JavaScript 的解析和评估,直到需要它为止。SproutCore ( http://sproutcore.com/ ) 的Charles Jolley ( http://www.okito.net/ ) 很快就接受了这个想法。他对此进行了实验(http://blog.sproutcore.com/faster-loading-through-eval/),发现通过将代码放在字符串中而不是对其进行注释,可以实现类似的性能提升。然后,尽管做出了承诺(http://www.okito.net/post/8409610016/on-sproutcore-2-0)将其构建到 SproutCore 中时,这项技术几乎被遗忘了。这很遗憾,因为它是延迟加载的一个有趣的替代方案,非常适合 CommonJS 模块。

About two years ago, the mobile Gmail team posted an article focused on reducing the startup latency (http://googlecode.blogspot.com/2009/09/gmail-for-mobile-html5-series-reducing.html) of their HTML5 application. It described a technique which enabled bypassing parsing and evaluation of JavaScript until it was needed by placing it inside comments. Charles Jolley (http://www.okito.net/) of SproutCore (http://sproutcore.com/) fame was quick to jump on the idea. He experimented with it (http://blog.sproutcore.com/faster-loading-through-eval/) and found that similar performance gains could be achieved by putting the code inside of a string rather than commenting it out. Then, despite promises (http://www.okito.net/post/8409610016/on-sproutcore-2-0) of building it into SproutCore, this technique pretty much fell into oblivion. That’s a shame because it’s an interesting alternative to lazy loading that suits CommonJS modules really well.

文本/JavaScript 类型的近距离接触

Close Encounters of the Text/JavaScript Type

为了理解这种技术的工作原理,让我们看看当浏览器的解析器遇到具有有效 src 属性的 script 元素时会发生什么。首先,向服务器发送请求。希望服务器响应并且浏览器继续下载(并缓存)所请求的文件。完成这些步骤后,仍然需要解析和评估文件(图 22-1)。

To understand how this technique works, let’s look at what happens when the browser’s parser encounters a script element with a valid src attribute. First, a request is sent to the server. Hopefully the server responds and the browser proceeds to download (and cache) the requested file. Once these steps are completed, the file still needs to be parsed and evaluated (Figure 22-1).

未缓存的 JavaScript 资源获取、解析和评估

图 22-1。未缓存的 JavaScript 资源获取、解析和评估

Figure 22-1. Uncached JavaScript resource fetching, parsing, and evaluation

为了进行比较,图 22-2显示了相同的请求命中热 HTTP 缓存的情况。

For comparison, Figure 22-2 shows the same request hitting a warm HTTP cache.

缓存的 JavaScript 资源获取、解析和评估

图 22-2。缓存的 JavaScript 资源获取、解析和评估

Figure 22-2. Cached JavaScript resource fetching, parsing, and evaluation

这里值得注意的是,除了缓存的明显好处之外,JavaScript 文件的解析和评估仍然会在每次页面加载时发生,而与缓存无关。虽然这些步骤在现代台式电脑上速度非常快,但在移动设备上却不然。即使在最近的高端设备上也是如此。考虑图 22-3中的图表,它比较了在 iPhone 3、4、4S、iPad、iPad 2、Nexus S 和 MacBook Pro 上解析和评估 jQuery 的成本。(请注意,这些结果仅供参考。它们是使用lazyeval.org ( http://lazyeval.org/ )托管的测试收集的,目前该测试仍处于 alpha 阶段。)

What’s worth noticing here—other than the obvious benefits of caching—is that parsing and evaluation of the JavaScript file still happen on every page load, regardless of caching. While these steps are blazing fast on modern desktop computers, they aren’t on mobile. Even on recent, high-end devices. Consider the graph in Figure 22-3, which compares the cost of parsing and evaluating jQuery on the iPhone 3, 4, 4S, iPad, iPad 2, a Nexus S, and a MacBook Pro. (Note that these results are indicative only. They were gathered using the test hosted at lazyeval.org (http://lazyeval.org/), which at this point is still very much alpha.)

解析和评估 jQuery

图 22-3。解析和评估 jQuery

Figure 22-3. Parsing and evaluating jQuery

请记住,这些时间是在您已经面临的任何网络成本之上的。此外,无论文件是否被缓存,它们都会在每个页面加载时产生。是的,您没看错。在 iPhone 4 上,每次加载页面时解析和评估 jQuery 需要 0.3 秒以上。可以说,随着更新的设备的出现,这些结果有了显着的改善,但你不能指望你的整个用户群都拥有上一代智能手机,不是吗?

Remember that these times come on top of whatever networking costs you’re already facing. Furthermore, they’ll be incurred on every single page load, regardless of whether or not the file was cached. Yes, you’re reading this right. On an iPhone 4, parsing and evaluating jQuery takes over 0.3 seconds, every single time the page is loaded. Arguably, those results have substantially improved with more recent devices, but you can’t count on your whole user base owning last generation smartphones, can you?

延迟加载

Lazy Loading

对于启动延迟问题,通常建议的解决方案是按需加载脚本(例如,在用户交互之后)。这种技术的主要优点是,它延迟了下载、解析和评估的成本,直到需要脚本为止。请注意,在实践中,除非您可以延迟所有JavaScript 文件,否则您最终将不得不支付两次往返费用(图 22-4)。

A commonly suggested solution to the problem of startup latency is to load scripts on demand (for example, following a user interaction). The main advantage of this technique is that it delays the cost of downloading, parsing, and evaluating until the script is needed. Note that in practice—and unless you can delay all your JavaScript files—you’ll end up having to pay round trip costs twice (Figure 22-4).

延迟加载 JavaScript

图 22-4。延迟加载 JavaScript

Figure 22-4. Lazy-loading JavaScript

然而,这种方法有许多缺点。首先,不保证代码能够交付:网络或服务器可能同时变得不可用。其次,代码传输的速度取决于网络质量,因此差异很大。最后,代码是异步交付的。这些缺点迫使开发人员在构建时既要考虑到防御性又要考虑异步性,从而在过程中将实现与交付机制不可挽回地联系在一起。除非整个代码库构建在这些前提上(这可能是您想要避免的事情),否则延迟加载一段代码将成为一项艰巨的任务。

There are a number of downsides to this approach, however. First of all, the code isn’t guaranteed to be delivered: the network or the server can become unavailable in the meantime. Secondly, the speed at which the code is transferred is subject to the network’s quality and can thus vary widely. Lastly, the code is delivered asynchronously. These downsides force the developer to build both defensively and with asynchronicity in mind, irremediably tying the implementation to its delivery mechanism in the process. Unless the whole codebase is built on these premises—which is probably something you want to avoid—deferring the loading of a chunk of code becomes a non-trivial endeavor.

惰性评估来拯救

Lazy Evaluation to the Rescue

惰性求值通过仅关注延迟解析和求值阶段来完全避免这些问题。该脚本可以与初始负载捆绑或内联。通过注释掉或转义并转换为字符串(“字符串化”?),可以防止在初始页面加载期间对其进行评估。在这两种情况下,只需在需要时评估内容(图 22-5)。

Lazy evaluation avoids these issues altogether by focusing on delaying the parsing and evaluation stages only. The script can be either bundled with the initial payload or inlined. It is prevented from being evaluated during initial page load by being either commented-out or escaped and turned into a string (“stringified”?). In both cases, the content is simply evaluated when required (Figure 22-5).
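As a rough illustration of the “stringified” variant, the sketch below ships a tiny made-up library as a string and only parses and evaluates it on first use. The `librarySource` content and the `getLibrary` helper are hypothetical; a real build step would generate the string from the actual library source.

```javascript
// The code travels as a string, so at page load the engine only has to
// scan a string literal instead of parsing and evaluating the library.
var librarySource = 'return { add: function (a, b) { return a + b; } };';

var cachedLibrary = null;

// Parse and evaluate the source on first use, then reuse the result.
function getLibrary() {
  if (cachedLibrary === null) {
    cachedLibrary = new Function(librarySource)();
  }
  return cachedLibrary;
}

// The parsing and evaluation cost is paid here, synchronously.
var sum = getLibrary().add(2, 3);
```

Because `getLibrary` is synchronous, calling code does not need callbacks or error handling for delivery, unlike lazy loading.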

图 22-5。惰性评估

Figure 22-5. Lazy evaluation

再次,为了进行比较,命中了一个热 HTTP 缓存(图 22-6

And again, for comparison, Figure 22-6 shows what happens when hitting a warm HTTP cache.

图 22-6。缓存脚本的延迟评估

Figure 22-6. Lazy evaluation of a cached script

正如 iPad 2 解析和评估 jQuery 的图表所示(图 22-7),这两个选项的性能始终优于常规评估至少十倍。在所有测试设备上都观察到类似的十倍性能改进。

As the graph of an iPad 2 parsing and evaluating jQuery shows (Figure 22-7), both options consistently out-perform regular evaluation by at least a factor of ten. Similar tenfold performance improvements were observed on all tested devices.

图 22-7。在 Pad 2 中解析和评估 jQuery

Figure 22-7. Parsing and evaluating jQuery on an iPad 2

注释掉的代码的性能指数比“字符串化”代码稍好。然而,如果不内联,提取起来可能会非常复杂。它也更脆弱:众所周知,一些电话运营商会删除 JavaScript 注释 ( http://www.mysociety.org/2011/08/11/mobile-operators-breaking-content/ )。另一方面,“字符串化”代码更加健壮并且更容易访问,这就是它受到青睐的原因。

Commented-out code performs slightly better than “stringified” code. It can, however, be quite complicated to extract when not inlined. It is also more brittle: some phone operators are known to strip out JavaScript comments (http://www.mysociety.org/2011/08/11/mobile-operators-breaking-content/). “Stringified” code, on the other hand, is both more robust and a lot easier to access, which is why it’s preferred.

将惰性求值构建到 CommonJS 模块中

Building Lazy Evaluation into CommonJS Modules

事实证明,CommonJS 模块 ( http://wiki.commonjs.org/wiki/Modules/1.1 ) 的额外间接级别(调用require )使其成为惰性评估的理想候选者。由于惰性求值是同步的,因此整个过程对开发人员来说完全透明。启用惰性计算成为配置文件中的一行,而不是一个大的架构更改。更好的是,可以利用通过静态分析构建的依赖关系图来自动延迟评估所有选定模块的依赖关系。

It turns out that the CommonJS module’s (http://wiki.commonjs.org/wiki/Modules/1.1) extra level of indirection (the require call) makes it an ideal candidate for lazy evaluation. Since lazy evaluation is synchronous, the whole process can be made completely transparent to the developer. Enabling lazy evaluation becomes a one-liner in a config file, not a large architectural change. Even better, the dependency graph built through static analysis can be leveraged to automatically lazy evaluate all the selected module’s dependencies.

在实现方面,启用 CommonJS 模块的惰性评估需要修改运行时,以便正确评估和包装以“字符串化”形式传输的模块。在我的 CommonJS 模块依赖项解析器 modulr ( https://github.com/tobie/modulr-node/ ) 中,这样做是这样的 ( https://github.com/tobie/modulr-node/blob/v0.6.1 /assets/modulr.sync.js#L26-29 ):

Implementation-wise, enabling lazy evaluation of CommonJS modules requires modifying the runtime so that it correctly evaluates and wraps modules which are transported in their “stringified” form. In modulr (https://github.com/tobie/modulr-node/), my CommonJS module dependencies resolver, this is done like so (https://github.com/tobie/modulr-node/blob/v0.6.1/assets/modulr.sync.js#L26-29):

if (typeof fn === 'string') {
  fn = new Function('require', 'exports', 'module', fn);
}

这意味着惰性评估模块将被转义(https://github.com/tobie/modulr-node/blob/v0.6.1/lib/collector.js#L56-61)并用引号括起来(https://github.com) /tobie/modulr-node/blob/v0.6.1/lib/collector.js#L76)在服务器端构建时、传输之前。

This implies that lazy-evaluated modules must be escaped (https://github.com/tobie/modulr-node/blob/v0.6.1/lib/collector.js#L56-61) and surrounded by quotes (https://github.com/tobie/modulr-node/blob/v0.6.1/lib/collector.js#L76) at build time on the server side, before transport.

初步结果令人鼓舞,但目前还只是正在进行中的工作。modulr 的未来计划包括完全缩小其输出(仅缩小输出是不行的,因为它会错过作为字符串传输的模块)、检测运行时以能够收集性能数据并尝试受 Souders 启发的每个模块 localStorage缓存(http://www.stevesouders.com/blog/2011/09/26/app-cache-localstorage-survey/)。如果有兴趣,我还想自动化lazyeval.org(http://lazyeval.org/),以允许它测量将此技术应用于其他JavaScript库的性能增益,并将这些结果报告给browserscope.org(http: //www.browserscope.org/)。

The initial results are promising, but at this point, it is merely work in progress. Future plans for modulr include enabling full minification of its output (just minifying the output won’t do, as it would miss modules transported as strings), instrumenting the runtime to be able to gather perf data, and experimenting with a Souders-inspired per-module localStorage cache (http://www.stevesouders.com/blog/2011/09/26/app-cache-localstorage-survey/). If there’s interest, I’d also like to automate lazyeval.org (http://lazyeval.org/) so that it can measure the performance gain of applying this technique to other JavaScript libraries and report those results to browserscope.org (http://www.browserscope.org/).

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/lazy-evaluation-of-commonjs-modules/。最初发布于 2011 年 12 月 22 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/lazy-evaluation-of-commonjs-modules/. Originally published on Dec 22, 2011.

第23章.关于信任建议的建议

Chapter 23. Advice on Trusting Advice

比利 ·霍夫曼

Billy Hoffman

我们都知道第三方内容意味着您不再控制影响页面加载时间的所有因素。由于第三方内容的问题,一个时尚、经过精心调整和优化的网站仍然可能提供较差的用户体验。Steve Souders 甚至曾经发布过一系列博客文章 ( http://stevesouders.com/p3pc/ ),他在其中分析和评估了第三方内容片段的性能 ( http://www.stevesouders.com/blog/2010 ) /02/17/第三方内容的性能/)。(亲爱的史蒂夫,请把这个带回来,太棒了)。Mathias Bynens 更进一步,展示了如何额外优化 Google 的标记和 JavaScript 片段 ( http://mathiasbynens.be/notes/async-analytics-snippet )。

We all know that third-party content means you no longer control all the factors which affect page load time. A sleek, well-tuned, and optimized site can still deliver a poor user experience because of problems with third-party content. Steve Souders even used to publish a series of blog posts (http://stevesouders.com/p3pc/) where he analyzed and rated the performance of third-party content snippets (http://www.stevesouders.com/blog/2010/02/17/performance-of-3rd-party-content/). (Dear Steve, please bring this back, it was awesome). Mathias Bynens took this one step further, showing how to additionally optimize Google’s markup and JavaScript snippets (http://mathiasbynens.be/notes/async-analytics-snippet).

从 Steve 和 Mathias 身上学到的令人惊讶的教训是,如果您想要一个快速的网站第三方小部件,那么您需要检查第三方内容是否存在性能问题,即使片段来自值得信赖的 Web 性能权威机构。所以这篇文章实际上并不是关于第三方内容。这将是关于信任建议。

The surprising lesson to learn from Steve and Mathias is that if you want a fast site and third-party widgets, then you need to examine the third-party content for performance problems, even when a snippet comes from a trusted authority on web performance. So this post isn’t really going to be about third-party content. It’s going to be about trusting advice.

上周,Zoompf 的客户、在线贵金属交易所 GoldMoney ( http://goldmoney.com/ ) 就我们的技术在其网站上标记的问题联系了支持人员。我们检测到 Google+ 按钮的 Google JavaScript 库存在问题。Zoompf WPO 建议客户做一些与 Google 的建议相矛盾的事情。这足以让 GoldMoney 犹豫不决。

Last week a Zoompf customer, the online precious metal exchange GoldMoney (http://goldmoney.com/), contacted Support about an issue our technology flagged on their site. We had detected an issue with Google’s JavaScript library for their Google+ button. Zoompf WPO was suggesting that the customer do something that contradicted Google’s advice. And that was enough to give GoldMoney pause.

plusone.jsZoompf 标记的具体问题是,从非 SSL 页面使用 SSL 引用Google 的库 ( http://zoompf.com/blog/2010/03/zoompf-check-300-or-gateways-got-a -问题)。SSL 很重要,因为如果使用得当(https://www.owasp.org/images/4/40/Ivan_Ristic_ - Breaking_SSL-_OWASP.pdf),它提供通信隐私和完整性。然而,CSS 文件或 JavaScript 库,甚至是使用启用 SSL 的超链接从不通过 SSL 提供服务的 HTML 页面引用的 Favicon,很可能不包含需要保护的信息。由于 SSL 提供这些安全功能的代价是降低 Web 性能(如下所述),因此仅在必要时才使用 SSL 非常重要。

The specific issue that Zoompf was flagging was that Google’s plusone.js library was being referenced using SSL from a non-SSL page (http://zoompf.com/blog/2010/03/zoompf-check-300-or-gateways-got-a-problem). SSL is important because, if used properly (https://www.owasp.org/images/4/40/Ivan_Ristic_-Breaking_SSL-_OWASP.pdf), it provides communications privacy and integrity. However, a CSS file, a JavaScript library, or even a favicon that is referenced using an SSL-enabled hyperlink from an HTML page that is not served over SSL most likely does not contain information that needs protecting. Since SSL provides these security features at the cost of decreased web performance (as discussed later), it is important to use SSL only when you have to.

在这种情况下,Googleplusone.js 按钮库不包含个人或私人信息。http://Zoompf 的建议是使用而不是 来检索 Google+ 库https://。以下是 Google 文档的内容(已添加重点):

In this case, the Google plusone.js button library does not contain personal or private information. Zoompf’s suggestion was to retrieve the Google+ library using http:// instead of https://. Here is what Google’s documentation has to say (emphasis added):

+1 按钮代码需要来自 Google 服务器的脚本。http://通过在通过 加载的页面上包含脚本 via ,您可能会收到此错误https://。我们建议使用 https://包含脚本:<script type="text/javascript" src="https://apis.google.com/js/plusone.js"></script>。如果您的网页使用https://,则当通过 http:// 调用页面上的任何资源时,某些浏览器和验证工具会显示错误。如果您的网站通过 https:// 提供页面,请确保这些页面上的 +1 按钮代码也使用 https://。(事实上​​,可以https://在所有页面的按钮代码中使用,即使它们仅通过 提供 http://。)

The +1 button code requires a script from Google’s servers. You may get this error by including the script via http:// on a page that’s loaded via https://. We recommend using https:// to include the script: <script type="text/javascript" src="https://apis.google.com/js/plusone.js"></script>. If your web page uses https://, some browsers and verification tools will show an error when any assets on the page are called via http://. If your site serves pages via https://, make sure that the +1 button code on those pages also uses https://. (In fact, it’s fine to use https:// in the button code for all pages, even if they are only served via http://.)

谷歌试图避免的“错误”是混合内容警告。它看起来如图 23-1所示:

The “error” that Google is trying to avoid is a mixed content warning. It looks like the one shown in Figure 23-1:

图 23-1。混合内容警告

Figure 23-1. Mixed content warning

当使用 HTTP 提供 HTTPS 引用的 HTML 页面时,会出现混合内容警告。由于现代浏览器中存在一些严重的设计缺陷 ( http://code.google.com/p/browsersec/wiki/Part2 ),混合内容可能会允许 DOM、cookie、引用 URL、会话 ID 等特权信息被泄露。被不受信任的各方访问。浏览器通常会显示一个令人困惑的对话框,或者只是无法呈现页面,具体取决于其安全设置。Google 的解决方案是始终plusjone.js使用 SSL 请求文件,即使不需要 SSL 时也是如此。

A mixed content warning happens when an HTML page served over HTTPS references resources using HTTP. Due to some serious design flaws (http://code.google.com/p/browsersec/wiki/Part2) in modern browsers, mixed content can allow privileged information like the DOM, cookies, referrer URLs, session IDs, and more to be accessed by untrusted parties. Browsers usually display a confusing dialog box or just fail to render the page, depending on their security settings. Google’s solution to avoid all this is to always request the plusone.js file using SSL, even when SSL is not needed.
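The condition that triggers the warning can be expressed as a small check. The sketch below is only an illustration: `findMixedContent` is a hypothetical helper, the example.com-style URLs are made up, and it assumes an environment that provides the standard `URL` constructor.

```javascript
// Return the references that would cause a mixed content warning:
// plain-HTTP resources referenced from a page served over HTTPS.
function findMixedContent(pageUrl, resourceUrls) {
  var page = new URL(pageUrl);
  if (page.protocol !== 'https:') {
    return []; // a page served over HTTP cannot have mixed content
  }
  return resourceUrls.filter(function (ref) {
    // Resolve relative and protocol-relative references against the page.
    return new URL(ref, page.href).protocol === 'http:';
  });
}

var flagged = findMixedContent('https://shop.example.com/cart', [
  'http://cdn.example.com/widget.js',  // flagged: HTTP on an HTTPS page
  'https://cdn.example.com/styles.css',
  '/images/logo.png'                   // inherits HTTPS from the page
]);
```

Here `flagged` contains only the `http://cdn.example.com/widget.js` reference.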

但仅仅为了好玩而使用 SSL 并不是一个好主意。SSL 通过多种方式对 Web 性能产生负面影响:

But using SSL, just for the fun of it, is not a good idea. SSL impacts web performance negatively in several ways:

简而言之,SSL 很棒,但它不是免费的。如果不需要,请不要使用它。

In short, SSL is great but it’s not free. Don’t use it if you don’t have to.

这里的解决方案是实际使用协议相对 URL ( http://blog.httpwatch.com/2010/02/10/using-protocol-relative-urls-to-switch- Between-http-and-https / ) 。协议相关 URL 是一种引用不同主机名上的资源的方法,无需指定使用什么协议来检索。所以 src="https://apis.google.com/js/plusone.js"你可以使用src="//apis.google.com/js/plusone.js". 考虑一个使用协议相关 URL 来引用的 HTML 页面plusone.js。如果使用 提供页面https://,则plusone.js使用 请求页面https://。安全性得到维护,不会出现混合内容安全警告。如果使用 提供页面http://,则将使用 HTTP 提供库。不会发生性能影响,也不会出现缓存问题。

The solution here is to actually use a protocol-relative URL (http://blog.httpwatch.com/2010/02/10/using-protocol-relative-urls-to-switch-between-http-and-https/). A protocol-relative URL is a way of referencing a resource on a different host name without specifying what protocol to use to retrieve it. So instead of src="https://apis.google.com/js/plusone.js" you can use src="//apis.google.com/js/plusone.js". Consider an HTML page that uses a protocol-relative URL to reference plusone.js. If the page was served using https://, then plusone.js is requested using https://. Security is maintained and no mixed content security warning will appear. If the page was served using http://, then the library will be served using HTTP. No performance hit happens and no caching issues come up either.
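This resolution behavior can be checked with the standard `URL` constructor, which resolves a reference against a base URL much as a browser resolves a `src` attribute against the page URL. The `resolve` helper and the www.example.com page URLs below are illustrative assumptions.

```javascript
var reference = '//apis.google.com/js/plusone.js';

// Resolve a reference against the URL of the page that contains it.
function resolve(ref, pageUrl) {
  return new URL(ref, pageUrl).href;
}

// On a page served over HTTPS, the library is requested over HTTPS.
var onSecurePage = resolve(reference, 'https://www.example.com/index.html');
// → 'https://apis.google.com/js/plusone.js'

// On a page served over HTTP, the library is requested over HTTP.
var onPlainPage = resolve(reference, 'http://www.example.com/index.html');
// → 'http://apis.google.com/js/plusone.js'
```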

现在,我知道您可能在想什么:“Stoyan 真的允许某个人在性能日历上讨论 11 个段落的协议相关 URL 了吗?” 是的,我确实谈到了一些很多人不熟悉的很酷的东西,它为一个令人惊讶的常见问题提供了一个优雅的解决方案。(事实上​​,还有很多其他关于协议相关 URL 的内容可以讨论,比如非标准 IE6 配置会导致奇怪的证书错误,或者 IE7 和 IE8 中的双重下载错误。所以算你自己幸运吧!)前面说过,协议相关 URL 的魔力不是本章的重点。

Now, I know what you might be thinking: “Did Stoyan seriously allow some guy a spot on the Performance Calendar to talk about protocol-relative URLs for eleven paragraphs?” Well yes, I did talk about something cool that many people are not familiar with and that provides an elegant solution to a surprisingly common problem. (In fact, there is tons of other stuff to talk about with protocol-relative URLs, like a non-standard IE6 configuration which causes a weird certificate error, or the double downloading bug in IE7 and IE8. So count yourself lucky!) But as I said earlier, the magic of protocol-relative URLs is not the point of this chapter.

本章的要点是您需要谨慎对待性能建议。不仅仅是你从哪里得到它,而是它说要做什么。谷歌太棒了。他们是当今业界网络性能最强有力的支持者之一。但没有人是完美的。Mathias 改进了他们的 Google Analytics 代码片段。他们的 Google 涂鸦始终是高品质的 JPEG,毫无必要地浪费带宽 ( https://twitter.com/zoompf/status/144920292446306305 )。有时,就像在本例中一样,他们的建议并不完全正确。正如佛陀曾经说过的:

The point of this chapter is that you need to be careful about performance advice. Not just where you get it, but what it says to do. Google is awesome. They are one of the strongest supporters of web performance in the industry today. But no one is perfect. Mathias improved upon their Google Analytics snippet. Their Google Doodles are always ludicrously high-quality JPEGs that needlessly waste bandwidth (https://twitter.com/zoompf/status/144920292446306305). And sometimes, like in this case, their advice is not quite right. As the Buddha once said:

不要相信任何事情,无论你在哪里读到它,或者是谁说的,即使是我说的,也不要相信,除非它符合你自己的理性和你自己的常识。

Believe nothing, no matter where you read it, or who has said it, not even if I have said it, unless it agrees with your own reason and your own common sense.

在将来自第三方的代码片段添加到您的网站之前,您应该始终检查它,无论是谁编写的,即使是 Steve Souders、Douglas Crockford 或 John Resig 编写的,以确保它不会违反您所确定的任何最佳实践。已经知道。

You should always examine a code snippet from a third party before including it in your site, regardless of who wrote it, even if Steve Souders or Douglas Crockford or John Resig wrote it, to make sure it does not violate any best practices that you already know.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/advice-on-trusting-advice/。最初发布于 2011 年 12 月 23 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/advice-on-trusting-advice/. Originally published on Dec 23, 2011.

第 24 章 为什么你可能会错误地解读你的绩效衡量结果(至少你有很好的伙伴)

Chapter 24. Why You’re Probably Reading Your Performance Measurement Results Wrong (At Least You’re in Good Company)

约书亚· 比克斯比

Joshua Bixby

2011 年我最喜欢的书之一是诺贝尔奖获得者心理学家丹尼尔·卡尼曼 (Daniel Kahneman) 写的《思考,快与慢》 。卡尼曼在他的书中指出了我们头脑中不断交战的两种思想体系:

One of my favorite books of 2011 was Thinking, Fast and Slow by the Nobel Prize-winning psychologist Daniel Kahneman. In his book, Kahneman identifies the two systems of thought that are constantly warring inside our heads:

  • 系统 1,快速且直观

  • System 1, which is fast and intuitive

  • 系统2,缓慢且逻辑性强

  • System 2, which is slow and logical

系统 1 几乎总是有缺陷,但我们却无可奈何地依赖它。我们也有一种痛苦的倾向,认为我们正在将系统 2 应用于我们的思维,而事实上它只是系统 1 的智力升级版本。

Almost invariably, System 1 is flawed, yet we helplessly rely on it. We also have a painful tendency to think we’re applying System 2 to our thinking, when in fact it’s just an intellectually tarted up version of System 1.

卡尼曼对这种想法进行了一个巧妙的小测试:

Kahneman offers a nifty little test of this thinking:

某个城镇有两家医院。较大的医院每天约有 45 个婴儿出生,较小的医院每天约有 15 个婴儿出生。如您所知,大约 50% 的婴儿是男孩。然而,确切的百分比每天都在变化。有时可能高于50%,有时则较低。在一年的时间里,每家医院都记录了超过 60% 的婴儿出生是男孩的日子。您认为哪家医院记录了更多这样的日子?

A certain town is served by two hospitals. In the larger hospital about 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. As you know, about 50% of all babies are boys. However the exact percentage varies from day to day. Sometimes it may be higher than 50%, sometimes lower. For a period of 1 year, each hospital recorded the days on which more than 60% of the babies born were boys. Which hospital do you think recorded more such days?

  A. 较大的医院

  A. The larger hospital

  B. 规模较小的医院

  B. The smaller hospital

  C. 大致相同(即彼此相差 5% 以内)

  C. About the same (that is, within 5% of each other)

正确答案是B,较小的医院。但正如卡尼曼指出的那样,“当向一些本科生提出这个问题时,22% 的人回答 A;22% 认为 B;56% 的人认为 C。抽样理论表明,小医院中 60% 以上的婴儿是男孩的预期天数比大医院要长得多,因为大样本不太可能偏离 50 %。统计的这一基本概念显然不属于人们的直觉。”

The correct answer is B, the smaller hospital. But as Kahneman notes, “When this question was posed to a number of undergraduate students, 22% said A; 22% said B; and 56% said C. Sampling theory entails that the expected number of days on which more than 60% of the babies are boys is much greater in the small hospital than in the large hospital, because the large sample is less likely to stray from 50%. This fundamental notion of statistics is evidently not part of people’s repertoire of intuition.”
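The sampling argument can be verified with a short calculation. The sketch below assumes a simple binomial model in which every birth is independently a boy with probability 0.5, and that each hospital delivers exactly 15 or 45 babies a day; `probMoreThan60PercentBoys` is a name made up for this example.

```javascript
// Probability that more than 60% of n births are boys, assuming each
// birth is independently a boy with probability 0.5 (binomial model).
function probMoreThan60PercentBoys(n) {
  var pmf = Math.pow(0.5, n); // P(exactly 0 boys)
  var total = 0;
  for (var k = 0; k <= n; k++) {
    // k / n > 0.6, written with integers to avoid rounding issues.
    if (k * 10 > n * 6) {
      total += pmf;
    }
    // Step from P(exactly k boys) to P(exactly k + 1 boys).
    pmf = pmf * (n - k) / (k + 1);
  }
  return total;
}

var small = probMoreThan60PercentBoys(15); // about 0.151
var large = probMoreThan60PercentBoys(45); // about 0.068
```

Under these assumptions the small hospital would record such a day roughly 55 times a year and the large hospital roughly 25 times, which is far from “about the same.”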

但这些只是一群吃奶酪的本科生,对吧?这不适用于我们的社区,因为我们都是伟大的直觉统计学家?如果计算机科学学位不能让你立即有效地掌握统计数据,那它还有什么意义呢?

But these are just a bunch of cheese-eating undergrads, right? This doesn’t apply to our community, because we’re all great intuitive statisticians? What was the point of that computer science degree if it didn’t allow you a powerful and immediate grasp of stats?

考虑到卡尼曼的发现,我决定自己进行一个小测试,看看普通的友好社区网络性能专家分析统计数据的能力如何。(为了保护无辜者,身份已被隐藏。)当然,鉴于样本量较小,您可以质疑我的测试的有效性。如果你不这样做我会很失望。

Thinking about Kahneman’s findings, I decided to conduct a little test of my own to see how well your average friendly neighborhood web performance expert is able to analyze statistics. (Identities have been hidden to protect the innocent.) Of course, you’re allowed to call into question the validity of my test, given its small sample size. I’d be disappointed if you didn’t.

方法论

The Methodology

我请我们社区中 10 名非常资深且德高望重的成员回答上述医院问题。我还请他们对这个小测试的结果发表评论。

I asked 10 very senior and well-respected members of our community to answer the hospital question, above. I also asked them to comment on the results of this little test.

图 24-1中显示的 RUM 结果捕获了 IE9 和 Chrome 16 大型电子商务网站特定产品页面上一天的活动。您会从该表中得出什么结论?

The RUM results shown in Figure 24-1 capture one day of activity on a specific product page for a large e-commerce site for IE9 and Chrome 16. What conclusions would you draw from this table?

图 24-1。朗姆酒结果

Figure 24-1. RUM results

结果

The Results

如果您必须总结此表,您可能会得出“Chrome 比 IE9 更快”的结论。这就是你从桌子上看到的故事,你直觉地被它吸引,因为那是你感兴趣的部分。该研究是使用特定产品页面完成的,捕获一天的数据,或包含 Chrome 的 45 个计时样本,这是很好的背景信息,但与整体故事无关。无论样本大小如何,您的摘要都是相同的,尽管荒谬的样本大小(即从两个数据点或 600 万个数据点捕获的结果)可能会引起您的注意。

If you had to summarize this table, you would probably conclude “Chrome is faster than IE9.” That’s the story you take away from looking at the table, and you intuitively are drawn to it because that’s the part that’s interesting to you. The fact that the study was done using a specific product page, captures one day of data, or contains 45 timing samples for Chrome is good background information, but isn’t relevant to the overall story. Your summary would be the same regardless of the size of the sample, though an absurd sample size (i.e., results captured from two data points or 6 million data points) would probably grab your attention.

医院问题结果:在医院问题上,我们比本科生好……但好不了多少。我调查的 10 个人中有 5 个人答错了这个问题。

Hospital question results: On the hospital question, we were better than the undergrads… but not by much. 5 out of 10 people I surveyed got the question wrong.

RUM 结果:我对缺乏对数据来源​​的关注感到惊讶。只有两个人指出样本量太小,无法从结果中得出有意义的结论,而且平均值对于此类分析毫无用处。其他八个人都关注 Chrome 比 IE9 更快这一(假定的)事实,他们向我讲述了有关 Chrome 改进的故事以及结果如何代表这些改进。

RUM results: I was amazed at the lack of focus on the source of the data. Only two people pointed out that the sample size was so low that no meaningful conclusions could be drawn from the results, and that averages were useless for this type of analysis. The other eight all focused on the (assumed) fact that Chrome is faster than IE9, and they told me stories about the improvements in Chrome and how the results are representative of these improvements.

结论

Conclusions

表格和描述包含两种信息:故事和故事的来源。我们的自然倾向是关注故事而不是来源的可靠性,最终我们相信我们内心的统计直觉。我一直对我们普遍未能认识到样本量的作用感到惊讶。作为一个物种,我们是糟糕的直觉统计学家。我们对样本大小或如何看待测量不够敏感。

The table and description contain information of two kinds: the story and the source of the story. Our natural tendency is to focus on the story rather than on the reliability of the source, and ultimately we trust our inner statistical gut feel. I am continually amazed at our general failure to appreciate the role of sample size. As a species, we are terrible intuitive statisticians. We are not adequately sensitive to sample size or how we should look at measurement.
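One way to make sample size visible is to report a margin of error next to every average. The sketch below uses the textbook normal approximation (1.96 standard errors for a 95% interval), which is itself crude for small or skewed samples; the function names and the 10% tolerance used in the usage example are assumptions chosen for illustration.

```javascript
// Mean and approximate 95% margin of error for a set of timing samples,
// using the normal approximation (1.96 standard errors).
function meanWithMargin(samples) {
  var n = samples.length;
  var mean = samples.reduce(function (a, b) { return a + b; }, 0) / n;
  var squares = samples.reduce(function (acc, x) {
    return acc + (x - mean) * (x - mean);
  }, 0);
  var stdDev = Math.sqrt(squares / (n - 1)); // sample standard deviation
  return { n: n, mean: mean, margin: 1.96 * stdDev / Math.sqrt(n) };
}

// Hypothetical rule of thumb: the sample is too small when the margin of
// error exceeds `tolerance` (a fraction, e.g. 0.1) of the mean.
function isSampleTooSmall(samples, tolerance) {
  if (samples.length < 2) {
    return true; // no spread can be estimated from fewer than two samples
  }
  var stats = meanWithMargin(samples);
  return stats.margin > tolerance * stats.mean;
}

// Two made-up load times in milliseconds: the mean is 2000, but the
// margin of error is about as large as the mean itself.
var tooSmall = isSampleTooSmall([1000, 3000], 0.1);
```

A tool that printed “2000 ms, plus or minus 1960 ms” instead of just “2000 ms” would make it much harder to tell a confident story from two data points.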

为什么这很重要?

Why Does This Matter?

RUM 正以前所未有的速度在企业中得到采用。它正在成为我们的测量基线和最终的事实来源。对于我们这些关心在现实世界中提高网站速度的人来说,这是在与传统综合测试的长期斗争中取得的令人难以置信的胜利(http://www.webperformancetoday.com/2011/07/05/web-performance-测量岛正在下沉/)。

RUM is being adopted in the enterprise at an unprecedented speed. It is becoming our measurement baseline and the ultimate source of truth. For those of us who care about making sites faster in the real world, this is an incredible victory in a long protracted battle against traditional synthetic tests (http://www.webperformancetoday.com/2011/07/05/web-performance-measurement-island-is-sinking/).

我现在经常进入使用 RUM 的企业。虽然我对赢得战争感到非常满意,但现在我们面临着一场重要的战斗。

I now routinely go into enterprises that use RUM. Although I take great satisfaction in winning the war, an important battle now confronts us.

要点

Takeaways

1. 当我们的样本量太小时,我们需要一些工具来警告我们。我们都在高中或大学学习过抽样技术。可以通过相当简单的程序计算任何给定样本量的错误风险。不要使用你的判断,因为它是有缺陷的。我们不仅需要保持警惕,还需要游说工具供应商来帮助我们。当样本量太小时,Google、Gomez、Keynote 和其他公司应该通知我们,特别是考虑到我们很容易出错。

1. We need tools that warn us when our sample sizes are too small. We all learned sampling techniques in high school or university. The risk of error can be calculated for any given sample size by a fairly simple procedure. Don’t use your judgement because it is flawed. Not only do we need to be vigilant but we need to lobby for the tool vendors to help us. Google, Gomez, Keynote, and others should notify us when sample sizes are too small—especially given how prone we are to error.

2. 平均值对于 RUM 结果来说是一个不好的衡量标准。RUM 结果可能会受到显着异常值的影响,这使得平均值在大多数情况下成为糟糕的衡量标准。不幸的是,我所知道的几乎所有现成产品都使用平均值。如果您需要查看一个数字,请查看中位数或第 95 个百分位数。

2. Averages are a bad measure for RUM results. RUM results can suffer from significant outliers, which make averages a bad measure in most instances. Unfortunately, averages are used in almost all of the off-the-shelf products I know. If you need to look at one number, look at medians or 95th percentile numbers.

3. 直方图是绘制数据图表的最佳方式。通过直方图,您可以看到性能测量值的分布,并且与平均值不同的是,您可以发现可能会扭曲结果的异常值。例如,我对同一页面进行了 500,000 个页面加载时间测量的数据集。如果我采用所有这些样本的平均加载时间,我会得到大约 6600 毫秒的页面加载时间。现在查看页面所有测量值的直方图(图 24-2 )。像这样在直方图中可视化测量结果更具洞察力,并且可以告诉我们更多有关该页面的性能概况的信息。

3. Histograms are the best way to graph data. With histograms you can see the distribution of performance measurements and, unlike averages, you can spot outliers that would otherwise skew your results. For example, I took a dataset of 500,000 page load time measurements for the same page. If I went with the average load time across all those samples, I’d get a page load time of ~6600msec. Now look at the histogram (Figure 24-2) for all the measurements for the page. Visualizing the measurements in a histogram like this is much more insightful and tells us a lot more about the performance profile of that page.

图 24-2。直方图可视化

Figure 24-2. Histogram visualization

(如果您想知道,整个数据集的页面加载时间中位数约为 5350 毫秒。这可能是页面性能的更准确指标,并且比平均值要好得多,但不如直方图那样能告诉我们正确的信息可视化性能概况。事实上,在 Strangeloop,我们通常会同时查看中值和性能直方图来了解全貌。)

(If you’re wondering, the median page load time across the data set is ~5350msec. This is probably a more accurate indicator of the page performance and much better than the average, but is not as telling as the histogram that lets us properly visualize the performance profile. As a matter of fact, here at Strangeloop, we usually look at both median and the performance histogram to get the full picture.)
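The difference between these summaries is easy to reproduce on a small scale. The sketch below uses ten made-up load times with one extreme outlier (not the dataset discussed above); the helper names are assumptions, and `percentile` uses the simple nearest-rank definition, which for an even number of samples picks the lower of the two middle values as the median.

```javascript
function mean(values) {
  return values.reduce(function (a, b) { return a + b; }, 0) / values.length;
}

// Nearest-rank percentile on a sorted copy; p is between 0 and 100.
function percentile(values, p) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Count how many values fall into each bucket of `width` milliseconds.
// Keys are the lower bound of each bucket.
function histogram(values, width) {
  var buckets = {};
  values.forEach(function (v) {
    var start = Math.floor(v / width) * width;
    buckets[start] = (buckets[start] || 0) + 1;
  });
  return buckets;
}

// Made-up load times in milliseconds, with one extreme outlier.
var loadTimes = [900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 60000];

var average = mean(loadTimes);         // 7170, dominated by the outlier
var median = percentile(loadTimes, 50); // 1300
var p95 = percentile(loadTimes, 95);    // 60000
var buckets = histogram(loadTimes, 1000);
// buckets: one sample under 1000, eight between 1000 and 1999, one at 60000
```

The average suggests a page that takes over seven seconds to load, while the median and the histogram show that nine out of ten loads finished in under two seconds.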

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/good-company/。最初发布于 2011 年 12 月 24 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/good-company/. Originally published on Dec 24, 2011.

第 25 章有损图像压缩

Chapter 25. Lossy Image Compression

谢尔盖 ·切尔内雪夫

Sergey Chernyshev

图像是网络上最古老的项目之一(紧随 HTML 之后),自从我们开始使用它们以来,图像的变化仍然很小。是的,除了原始的 GIF 之外,我们现在还获得了 JPEG 和 PNG,但除此之外,没有太多改进可以使它们变得更好。

Images are one of the oldest items on the Web (right after HTML), and still so little has changed since we started to use them. Yes, we now have JPEG and PNG in addition to the original GIF, but other than that, there have not been many improvements to make them better.

也就是说,如果你不算创造它们的大量创意人才,事实上它创造了我们现在所知道的网络,闪亮且充满营销潜力!没有图像,我们就没有构建 Web 的工作,没有图像,我们就不会担心 Web 性能,因为没有用户关心体验,也没有业务人员为改进付费。

That is, if you don’t count lots of creative talent that went into creating them, so much in fact that it created the Web as we know it now, shiny and full of marketing potential! Without images we wouldn’t have the job of building the Web, and without images we wouldn’t worry about web performance because there would be no users to care about experience and no business people to pay for improvements.

话虽这么说,我们网站上的图像是通过网络来回发送的最大有效负载,在很大程度上降低了用户体验。

That being said, images on our websites are the largest payload sent back and forth across the wires of the Net taking a big part in slowing down user experience.

根据 HTTPArchive(图 25-1http://httparchive.org/interesting.php#bytesperpage),JPEG、GIF 和 PNG 占总体页面大小的 63% ,总体图像大小与总体页面加载时间有 0.64 的相关性(图 25-2http://httparchive.org/interesting.php#onLoad)。

According to HTTPArchive (Figure 25-1, http://httparchive.org/interesting.php#bytesperpage), JPEGs, GIFs and PNGs account for 63% of overall page size and overall image size has 0.64 correlation with overall page load time (Figure 25-2, http://httparchive.org/interesting.php#onLoad).

图 25-1。按内容类型划分的平均字节数

Figure 25-1. Average bytes by content type

图 25-2。与加载时间的相关性

Figure 25-2. Correlation to load times

尽管如此,我们仍然可以放心地假设,我们将拥有更多的图像,并且它们只会变得更大,以及台式计算机上的屏幕分辨率。

Still we can safely assume that we are going to have only more images and they will only grow bigger, along with the screen resolutions on desktop computers.

有损压缩

Lossy Compression

有几种不同的方法可以优化图像,包括压缩、分割、选择适当的格式、调整大小等。处理图像还有许多其他方面,包括后加载、缓存、URL 版本控制、CDN 等。

There are a few different ways to optimize images, including compression, spriting, picking an appropriate format, resizing, and so on. There are many other aspects of handling images, including postloading, caching, URL versioning, CDNs, etc.

在本文中,我想重点讨论有损压缩,其中图像的质量特征发生变化,而不会给用户带来显着的视觉差异,但性能会发生显着变化

In this article I wanted to concentrate on lossy compression where quality characteristics of the images are changed without significant visual differences for the user, but with significant changes to performance.

到目前为止,我们大多数人都熟悉无损压缩,这要感谢 Stoyan ( http://www.phpied.com/ ) 和 Nicole ( http://www.stubbornella.org/ ),他们首先向我们介绍了图像优化使用名为 Smush.it ( http://www.smushit.com/ysmush.it/ )(现在由 Yahoo! 运行)的出色在线工具来提高 Web 性能。例如,现在还有一些其他工具具有类似的 PNG 功能。

By now most of us are familiar with loss-less compression, thanks to Stoyan (http://www.phpied.com/) and Nicole (http://www.stubbornella.org/) who first introduced us to image optimization for web performance with an awesome on-line tool called Smush.it (http://www.smushit.com/ysmush.it/) (now run by Yahoo!). There are a few other tools now that have similar functionality for PNG, for example.

使用 smush.it,图像质量可以保持原样,仅删除不必要的元数据,通常可以节省高达 30-40% 的文件大小。这是一个安全的选择,当您这样做时图像将完好无损。这似乎是唯一的出路,特别是对于您的设计部门来说,他们相信一旦图像从计算机中出来,它就是神圣的,并且必须保持完全相同。

With smush.it, image quality is preserved as is, with only unnecessary metadata removed; this often saves up to 30-40% of file size. It is a safe choice and images will be intact when you do that. This seems the only way to go, especially for your design department, who believe that once an image comes out of their computers it is sacred and must be preserved absolutely the same.

事实上,图像的质量并不是一成不变的——JPEG 是作为一种允许以牺牲质量为代价缩小尺寸的格式而发明的。Web 因图像而流行,如果它们是在 JPEG 之前占主导地位的 BMP、TIFF 或 PCX 格式,那么 Web 就不会出现。

In reality, the quality of an image is not set in stone: JPEG was invented as a format that allows for size reduction at the price of quality. The Web got popular because of images; it wouldn’t be here if they were in the BMP, TIFF, or PCX formats that dominated prior to JPEG.

图 25-3。JPEG 质量设置

Figure 25-3. JPEG quality settings

这就是为什么我们需要真正开始使用 JPEG 的质量可调功能。如果您使用照片编辑器的导出功能,您甚至可能在设置中看到它 -图 25-3 是 Adob​​e Photoshop 中“导出到网络和设备”屏幕的质量调整部分的屏幕截图。

This is why we need to actually start using this feature of JPEG where quality is adjustable. You probably even saw it in settings if you used export functionality of photo editors—Figure 25-3 is a screenshot of quality adjusting section of “export for web and devices” screen in Adobe Photoshop.

质量设置范围从 1 到 100,其中 75 通常足以满足所有照片的需要,其中一些照片即使值为 30 看起来也足够好。在 Photoshop 和其他工具中,您通常可以用自己的眼睛看到差异并进行适当调整,从而使确保质量永远不会低于某个点,这主要取决于图像。

The quality setting ranges from 1 to 100, with 75 usually being enough for all photos and some of them looking good enough even with a value of 30. In Photoshop and other tools, you can usually see the differences with your own eyes and adjust appropriately, making sure quality never degrades below a certain point, which mainly depends on the image.

生成的图像大小在很大程度上取决于图像的原始来源和图片的视觉特征,有时可以节省高达 80% 的大小而不会显着降低。

Resulting image size heavily depends on the original source of the image and visual features of the picture, sometimes saving up to 80% of the size without significant degradation.

我知道这些数字听起来很模糊,但这正是我们所有人在需要自动化图像优化时面临的问题。所有图像都是不同的,如果没有人查看它们,就无法预测固定的质量设置是否会损坏图像或根本无法经常保存它们。不幸的是,在这个过程中聘请人工编辑成本高昂、耗时,有时甚至根本不可能,例如当网站上使用 UGC(用户生成内容)时。

I know these numbers sound pretty vague, but that is exactly the problem that all of us faced when we needed to automate image optimization. All images are different, and without having a person looking at them, it’s impossible to predict whether a fixed quality setting will damage the images or simply fail to save enough bytes. Unfortunately, having a human editor in the middle of the process is costly, time-consuming, and sometimes simply impossible, for example when UGC (user-generated content) is used on the site.

自从我看到 smush.it 在无损压缩方面做得很好之后,我就被这个问题困扰了。幸运的是,今年出现了两种可以自动进行有损图像压缩的工具:一种开源工具是我的前同事 Ryan Flynn 专门为 WPO 目的开发的,名为 ImgMin ( https://github.com/rflynn/ imgmin),另一个是名为 JPEGmini(http://www.jpegmini.com/)的商业工具,它是针对消费者照片尺寸缩小而产生的。

I have been bothered by this problem ever since I saw smush.it doing a great job with lossless compression. Luckily, this year, two tools emerged that allow for automation of lossy image compression: one is an open source tool called ImgMin (https://github.com/rflynn/imgmin), developed specifically for WPO purposes by my former co-worker, Ryan Flynn, and the other is a commercial tool called JPEGmini (http://www.jpegmini.com/), which came out of consumer photo size reduction.

我不能代表 JPEGmini,他们的技术(http://www.jpegmini.com/main/technology)是私有的,正在申请专利,但 ImgMin 使用一种简单的方法来尝试不同的质量设置,然后选择具有以下效果的结果:图像差异在一定阈值内。还有一些其他简单的启发式方法,因此有关更多详细信息,您可以阅读 ImgMin 在 Github 上的文档 ( https://github.com/rflynn/imgmin#readme )。

I can’t speak for JPEGmini, their technology (http://www.jpegmini.com/main/technology) is private with patents pending, but ImgMin uses a simple approach of trying different quality settings and then picking the result that has the picture difference within a certain threshold. There are a few other simple heuristics, so for more details you can read ImgMin’s documentation on Github (https://github.com/rflynn/imgmin#readme).
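The general idea of trying several quality settings and keeping the smallest acceptable result can be sketched as follows. This is not ImgMin’s actual code: the binary search is a choice made for this example, `encode` and `difference` are hypothetical callbacks the caller would have to supply (for instance, wrappers around an image library), and the search assumes that the difference from the original shrinks as quality rises.

```javascript
// Find the lowest quality whose output stays within `threshold` of the
// original. `encode(quality)` returns a re-compressed image and
// `difference(image)` returns its distance from the original; both are
// hypothetical, caller-supplied functions.
function pickQuality(encode, difference, threshold, minQuality, maxQuality) {
  var lo = minQuality;
  var hi = maxQuality;
  var best = null;
  while (lo <= hi) {
    var mid = Math.floor((lo + hi) / 2);
    var candidate = encode(mid);
    if (difference(candidate) <= threshold) {
      best = { quality: mid, image: candidate };
      hi = mid - 1; // acceptable: see whether a lower quality also works
    } else {
      lo = mid + 1; // too different from the original: raise the quality
    }
  }
  return best; // null when no quality in the range is acceptable
}

// Usage with stand-in callbacks (a real setup would call an image library):
var result = pickQuality(
  function (quality) { return { quality: quality }; },
  function (image) { return 100 - image.quality; }, // fake difference metric
  25, 1, 100
);
```

With these stand-in callbacks the search settles on a quality of 75 after seven encodes, rather than encoding the image at all 100 settings.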

这两个工具的效果都很好,但结果有所不同:ImgMin 较为简单,因此精度稍低。JPEGmini 提供专用服务器解决方案,云服务即将推出。

Both tools work pretty well, though their results differ: ImgMin, being simpler, is less precise. JPEGmini offers a dedicated server solution, with a cloud service coming soon.

在图 25-4 中,您可以看到我的 Twitter 用户图片,以及使用无损 (smush.it) 和有损 (JPEGmini) 压缩对其自动优化的结果。请注意,原始图像和优化图像之间没有可察觉的质量下降。在较大的照片上,结果也惊人地相似。

In Figure 25-4, you can see my Twitter user pic and how it was automatically optimized using lossless (smush.it) and lossy (JPEGmini) compression. Notice that there is no perceivable quality degradation between the original and optimized images. Results are astonishingly similar on larger photos as well.

图 25-4。原始(10028 字节)、无损(9834 字节,节省 2%)、有损(4238 字节,节省 58%)

Figure 25-4. Original (10028 bytes), lossless (9834 bytes, 2% savings), lossy (4238 bytes, 58% savings)

这是个好消息,因为它最终将使我们能够自动执行有损压缩,这一直是一个手动过程 - 现在您可以依靠一个工具并将其可靠地构建到您的图像处理管道中!

This is great news as it will finally allow us to automate lossy compression, which was always a manual process—now you can rely on a tool and reliably build it into your image processing pipeline!

注意

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/lossy-image-compression/。最初发布于 2011 年 12 月 25 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/lossy-image-compression/. Originally published on Dec 25, 2011.

第 26 章使用 Selenium 和 JavaScript 进行性能测试

Chapter 26. Performance Testing with Selenium and JavaScript

JP 卡斯特罗

JP Castro

如今,许多网站都采用真实用户监控工具,例如 New Relic(http://newrelic.com/features/real-user-monitoring)或 Gomez(http://www.compuware.com/application-performance-management/real-user-monitoring.html)来衡量生产应用程序的性能。这些工具通过提供实时指标带来巨大的价值,并让工程师能够识别和解决潜在的性能瓶颈。

Nowadays many websites employ real user monitoring tools such as New Relic (http://newrelic.com/features/real-user-monitoring) or Gomez (http://www.compuware.com/application-performance-management/real-user-monitoring.html) to measure the performance of production applications. Those tools provide great value by giving real-time metrics, and they allow engineers to identify and address potential performance bottlenecks.

这对于实时部署的应用程序来说效果很好,但是对于分阶段设置呢?工程师可能希望在部署到生产之前(也许是在进行质量检查过程时)查看性能。他们可能希望找到可能的性能回归或确保新功能速度很快。然而,分阶段设置可以驻留在公司网络上,从而限制前面提到的 RUM 工具的使用。

This works well for live deployed applications, but what about a staged setup? Engineers might want to look at the performance before deploying to production, perhaps while going through a QA process. They may want to find possible performance regressions or make sure a new feature is fast. However, the staged setup may reside on a corporate network, which restricts the use of the RUM tools mentioned earlier.

那么托管在防火墙环境中的应用程序又如何呢?并非所有 Web 应用程序都公开托管在 Internet 上。有些安装在私人数据中心仅供内部使用(考虑内部网类型的设置)。

And what about an application hosted in a firewalled environment? Not all web applications are publicly hosted on the Internet. Some are installed in private data centers for internal use only (think about an intranet type of setup).

如何观察这些类型场景中的应用程序性能?在本章中,我将解释我们如何利用开源软件来构建我们的性能测试套件。

How can you watch application performance in these types of scenarios? In this chapter, I’ll explain how we leveraged open source software to build our performance test suite.

记录数据

Recording Data

第一步是记录数据。为此,我们使用一些自定义代码来记录在多个层上花费的时间:前端、Web 层、后端 Web 服务和数据库。

The initial step is to record data. For that purpose we use a bit of custom code that records time spent on multiple layers: front end, web tier, backend web services, and database.

我们的 Web 层是传统的服务器端 MVC 应用程序,为浏览器生成 HTML 页面(我们使用 PHP 和 Zend Framework,但这可以适用于任何其他技术堆栈)。

Our web tier is a traditional server-side MVC application that generates an HTML page for the browser (we use PHP and the Zend Framework, but this could apply to any other technology stack).

首先,我们在调用 MVC 框架之前存储服务器端脚本启动的时间:

First, we store the time at which the server side script started, right before we invoke the MVC framework:

<?php
// store script start time in microseconds
define('START_TIME', microtime(TRUE));
?>

其次,当 MVC 框架准备好将页面缓冲回浏览器时,我们插入一些内联 JavaScript 代码,其中包括:

Second, when the MVC framework is ready to buffer the page back to the browser, we insert some inline JavaScript code, which includes:

  • 捕获的开始时间(“请求时间”)

  • The captured start time (“request time”)

  • 当前时间(“响应时间”)

  • The current time (“response time”)

  • 执行后端调用所花费的总时间(我们如何知道此信息?我们的 Web 服务客户端会跟踪执行 Web 服务调用所花费的时间;并且对于每个 Web 服务响应,后端都会包含执行数据库调用所花费的时间)。

  • The total time spent doing backend calls (How do we know this information? Our web service client keeps track of the time spent doing web service calls, and with each web service response, the backend includes the time spent doing database calls).

除了这些指标之外,我们还包含一些用于捕获以下内容的 jQuery 代码:

In addition to those metrics, we include some jQuery code to capture:

  • 文档就绪事件时间

  • The document ready event time

  • 窗口onload事件时间

  • The window onload event time

  • 最后一次点击的时间(我们将其存储在 cookie 中以供下一页加载)

  • The time of the last click (which we store in a cookie for the next page load)

换句话说,在我们的 HTML 文档(接近结尾的地方)中,我们有几行 JavaScript,如下所示:

In other words, in our HTML document (somewhere toward the end), we have a few lines of JavaScript that look like this:

<script>
// Reuse the Perf object created in <head>, or create it if it is missing.
// Reading an undeclared global would throw a ReferenceError, so go through window.
var Perf = window.Perf || {};
// Server-side values (in seconds) are written into the page by PHP
Perf.requestTime = <?= START_TIME ?>;
Perf.responseTime = <?= microtime(TRUE) ?>;
Perf.wsTime = <?= $wsTime ?>;
Perf.dbTime = <?= $soapTime ?>;
// Browser-side values are divided by 1000 to match PHP's seconds
$(document).ready(function(){
  Perf.readyTime = new Date().getTime()/1000;
});
$(window).bind("load", function(){
  Perf.renderTime = new Date().getTime()/1000;
  Perf.clickTime = getLastClickTime();
});
$(window).bind("unload", function(){
  storeLastClickTime(new Date().getTime()/1000);
});
</script>

最后,我们在 head 标签中插入几行 JavaScript 代码,以便我们可以记录浏览器接收页面的大致时间。正如 Alois Reitbauer 在 Timing the Web ( http://calendar.perfplanet.com/2011/timing-the-web/ )中指出的那样,这是一个近似值,因为它没有考虑 DNS 查找之类的事情。

Finally, we insert a couple more JavaScript lines in the head tag, so that we can record an approximate time at which the page was received by the browser. As Alois Reitbauer pointed out in Timing the Web (http://calendar.perfplanet.com/2011/timing-the-web/), this is an approximation as it does not account for things like DNS lookups.

<head>
<script>
// First script on the page: create the global Perf object.
// Going through window avoids a ReferenceError when Perf does not exist yet.
var Perf = window.Perf || {};
// Approximate time (in seconds) at which the browser received the page
Perf.receivedTime = new Date().getTime()/1000;
</script>
[...] more code [...]
</head>

现在我们已经有了浏览器中给定请求的一些指标,我们如何检索它们以便我们可以检查它们?

Now that we have some metrics for a given request in the browser, how do we retrieve them so that we can examine them?

收集和分析数据

Collecting and Analyzing the Data

这就是 Selenium 发挥作用的地方。我们使用 Selenium 来模拟一个人使用我们的 Web 应用程序。同样,这与技术无关,因为您可以通过各种语言控制 Selenium(我们使用 PHP 和 PHPUnit,但您可以使用 Python 或 Ruby 进行相同的操作)。

This is where Selenium comes into play. We use Selenium to simulate a person using our web application. Again, this is technology agnostic, as you can control Selenium from various languages (we use PHP and PHPUnit, but you could do the same with Python or Ruby).

Selenium 有一个 API,您可以调用它来执行一段 JavaScript 代码片段并获取所执行代码的输出。这个 API 称为 getEval。

Selenium has an API that you can call to invoke a JavaScript snippet and get back the output of the executed code. This API is called getEval.

在我们的测试代码中,我们首先打开一个要分析的页面,然后使用 getEval API 检索我们记录的指标,最后存储这些指标以供以后使用。

Within our test code, we first open a page we want to analyze, then use the getEval API to retrieve the metrics we recorded, and finally store the metrics for later consumption.

class ExampleSeleniumTest extends PHPUnit_Extensions_SeleniumTestCase
{
  public function testLoadSomePage()
  {
    // Open our web application
    $this->open('/');
    // Click a link to load the page we want to analyze
    // ('Some Page' is a placeholder locator)
    $this->clickAndWait('Some Page');
    // Use getEval API to retrieve the metrics we recorded
    $metrics = $this->getEval('window.Perf');
    // Call our internal method that will store the metrics for later use
    // Note: we include a reference to the page or to what use case we are testing
    $this->saveMetrics('some-page', $metrics);
  }
}

我们将此模式用于应用程序中的多个用例。另请注意,虽然我使用了完整页面加载的示例,但我们的框架还支持收集 AJAX 交互的指标,我们经常这样做(例如远程加载由用户点击触发的内容)。

We use this pattern for multiple use cases in our application. Also note that while I used the example of a full page load, our framework also supports collecting metrics for AJAX interactions, which we do quite a lot (for instance remotely loading content triggered by a user click).
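One practical detail when pulling an object such as Perf out of the browser: values generally cross the Selenium boundary as text, so it helps to serialize the metrics first. The following is a minimal sketch under that assumption. `serializePerf` is a hypothetical helper name, not part of Selenium or of the framework described here, and it assumes the browser provides `JSON.stringify`.

```javascript
// Hypothetical helper: turn the Perf object into a JSON string so that it
// survives being returned as text. Only finite numeric fields are kept, which
// drops functions and any metric that was never recorded (undefined or NaN).
function serializePerf(perf) {
  const metrics = {};
  Object.keys(perf || {}).forEach(function (name) {
    const value = perf[name];
    if (typeof value === 'number' && isFinite(value)) {
      metrics[name] = value;
    }
  });
  return JSON.stringify(metrics);
}
```

If such a helper were included in the page, the test could evaluate `serializePerf(window.Perf)` and decode the returned string (for example with PHP's `json_decode`) before storing it.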

使用 Selenium 的一大好处是支持多种浏览器。我们有一组运行各种版本的 Internet Explorer 和 Firefox 的虚拟机。这使得我们的性能测试套件能够跨多个平台运行。

One of the great things about using Selenium is multiple browser support. We have a set of virtual machines running various versions of Internet Explorer and Firefox. This enables our performance test suite to run across multiple platforms.

最后一个难题是分析我们收集的数据。为此,我们构建了一个小型数据库驱动的应用程序,用于读取我们收集的指标并绘制它们。我们可以应用过滤器,例如特定的浏览器供应商或版本、特定的用例、我们软件的特定版本等。然后我们可以查看一段时间内的完整数据。

The last piece of the puzzle is analyzing the data we collected. For this purpose, we built a small database-driven application that reads the metrics we collected and plots them. We can apply filters such as specific browser vendor or version, specific use case, specific version of our software, etc. We can then look at the complete data over time.
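The filtering step can be illustrated with a small sketch. The record shape used here (`browser`, `useCase`, `release`, and a `metrics` object) is an assumption made for the example, not the schema of the application described above.

```javascript
// Sketch: keep only the samples that match every field in `filter`, then
// average one metric per software release. The record shape is hypothetical.
function averageByRelease(samples, filter, metricName) {
  const totals = {};
  samples
    .filter(function (sample) {
      return Object.keys(filter).every(function (key) {
        return sample[key] === filter[key];
      });
    })
    .forEach(function (sample) {
      const bucket = totals[sample.release] || (totals[sample.release] = { sum: 0, count: 0 });
      bucket.sum += sample.metrics[metricName];
      bucket.count += 1;
    });

  const averages = {};
  Object.keys(totals).forEach(function (release) {
    averages[release] = totals[release].sum / totals[release].count;
  });
  return averages;
}
```

Calling `averageByRelease(samples, { browser: 'IE8', useCase: 'some-page' }, 'renderTime')` would give one average per release, which is the kind of series that can then be plotted over time.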

图 26-1显示了我们用来绘制收集到的数据的逻辑。

Figure 26-1 shows the logic we use to plot the data we collected.

图 26-1。Web 请求时间

Figure 26-1. Web request times
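As a rough illustration of how the recorded values could be turned into per-layer durations, here is a sketch. It is not necessarily the logic shown in the figure: it assumes that `wsTime` already includes `dbTime`, and it only subtracts timestamps taken on the same clock (server values from each other, browser values from each other).

```javascript
// Sketch: derive per-layer durations (in seconds) from the Perf fields
// recorded earlier. Assumes wsTime includes dbTime; server and browser
// timestamps are never mixed because their clocks may differ.
function breakDownTimings(perf) {
  const serverTotal = perf.responseTime - perf.requestTime; // server clock
  return {
    database: perf.dbTime,
    webServices: perf.wsTime - perf.dbTime, // backend time outside the database
    webTier: serverTotal - perf.wsTime, // time spent in the web tier itself
    domReady: perf.readyTime - perf.receivedTime, // browser clock
    onload: perf.renderTime - perf.readyTime // browser clock
  };
}
```

A stacked bar per sample built from these five values makes it easy to see which layer dominates a slow request.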

结果示例

Sample Results

图 26-2 是收集数据后生成的图表示例。

Figure 26-2 is an example of a chart generated after collecting data.

图 26-2。网络计时示例

Figure 26-2. Web timings sample

在上面的示例中,我们可以观察到示例 1 中存在客户端性能问题、示例 2 中后端 Web 服务中的一些低效代码以及示例 3 中缓慢的数据库查询。

In the above sample, we can observe a client-side performance issue in Sample 1, some inefficient code in the backend web services in Sample 2 and a slow database query in Sample 3.

好处

Benefits

当我们在 2009 年构建这个框架时,我们心中有多个目标:

When we built this framework in 2009, we had multiple goals in mind:

  • 监控我们各个软件版本之间的性能并捕获可能的回归

  • Monitor performance between our software releases and catch possible regressions

  • 监控即将推出的功能的性能

  • Monitor performance of upcoming features

  • 随着我们添加更多用户/更多数据,监控软件的可扩展性

  • Monitor the scalability of the software as we add more users/more data

回顾过去,这个工具取得了一些很好的成果,以下是一些示例:

Looking back, this tool yielded some great results and here are a few examples:

  • 发现了我们 JavaScript 代码中的错误,这些错误会导致 IE 中的加载时间显著增加

  • Discovered bugs in our JavaScript code that resulted in much higher load times in IE

  • 发现我们使用 JavaScript 操作 HTML 的方式存在问题,并能够提高受影响的用户交互的响应能力

  • Found issues in the way we were manipulating HTML with JavaScript and were able to improve the responsiveness of the impacted user interactions

  • 随着数据量的增加,我们的后端 Web 服务中的瓶颈被消除:我们能够准确地查明问题所在(低效的后端代码、缓慢的数据库查询等)

  • Eliminated bottlenecks in our backend web services as we raised the amount of data: we were able to pinpoint exactly where the problem was (inefficient backend code, slow database queries, etc.)

结束语

Closing Words

总之,我想研究一下我们想要改进我们的设置的一些想法。

In conclusion, I’d like to look into some ideas we have in mind to improve our setup.

我想更频繁地使用该工具。目前,我们在开发过程中和每次发布之前多次运行测试套件,但这是一个手动过程。如果能将测试套件与我们的 Jenkins CI 构建结合起来,那就太棒了。一个不同的想法是将该工具作为我们产品的一部分提供并在生产中运行它,为我们提供一些关于我们平台的实际使用情况的分析。

I’d like to use the tool more often. We currently run the test suite several times during our development process and before each release, but this is a manual process. It would be great to tie in the test suite with our Jenkins CI builds. A different idea would be to ship the tool as part of our product and run it in production, providing us with some analytics on real world usage of our platform.

正如我提到的,我们使用虚拟机在多个平台上进行测试。这增加了一些维护方面的开销。也许我们应该看看 Sauce Labs 托管的 Selenium 解决方案?

As I mentioned, we are using virtual machines to test on multiple platforms. This adds a bit of overhead in terms of maintenance. Maybe we should look at the hosted Selenium solution from Sauce Labs?

当我们构建产品时,性能领域的格局有些不同,今天有一些当时还没有的工具。如果我们利用 WebPageTest、boomerang 等,我们会看到任何好处吗?

When we built the product, the performance landscape was a bit different and there are tools today that were not available back then. Would we see any benefits if we were to leverage WebPageTest, boomerang, etc.?

制作人员

Credits

我要感谢 Bill Scott 关于 Netflix 的 RUM 的演讲,它启发了我们构建自己的框架。

I’d like to acknowledge Bill Scott for his presentation on RUM at Netflix, which inspired us to build our framework.

注意

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/performance-testing-with-selenium-and-javascript/。最初发布于 2011 年 12 月 26 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/performance-testing-with-selenium-and-javascript/. Originally published on Dec 26, 2011.

第 27 章衡量网站性能的简单方法

Chapter 27. A Simple Way to Measure Website Performance

帕维尔 ·保劳

Pavel Paulau

不久前,Neustar 的人员在 Velocity Conference 上演示了仅使用免费开源解决方案进行有效客户端性能测试的可能性。他们介绍了一系列工具,例如 Selenium 和 BrowserMob Proxy。前者用于自动模拟用户交互,后者则适合捕获指标。那是一场非常鼓舞人心的演讲。

Not so long ago, folks from Neustar demonstrated at Velocity Conference the possibility of effective client-side performance testing using only free, open source solutions. They introduced a bundle of tools, such as Selenium and BrowserMob Proxy. The first one is intended to automate emulation of user interactions; the second one is good for capturing metrics. That was a really inspiring presentation.

他们的方法的最大特点是所有性能数据都整合到一个容器中 - HTTP Archive ( HAR )。由于严格的格式标准化,它使得测试结果的进一步处理更加可控和可预测。

The greatest feature of their approach was the fact that all performance data are consolidated into a single container—HTTP Archive (HAR). It makes further processing of test results more controlled and predictable due to strict format standardization.

然而,当时还没有处理 HAR 文件的高级工具。HAR Viewer 很棒,但不适合常见的测试工作流程。相反,ShowSlow 是自动性能测量存储库的完美示例。不幸的是,HAR 文件的处理并不是它的最强特性。于是一个新的项目HAR Storage(http://code.google.com/p/harstorage/)出现了。

However, there were no advanced tools for dealing with HAR files at the time. HAR Viewer is wonderful but not suitable for a common testing workflow. ShowSlow, on the other hand, is a perfect example of a repository for automated performance measurements. Unfortunately, handling HAR files is not its strongest trait. So a new project, HAR Storage (http://code.google.com/p/harstorage/), appeared.

概念

Concept

测试过程相当简单。您所需要的只是创建一个描述常见用户操作的 Selenium 脚本。然后,您可以使用方法来武装您的脚本,以通过其 API 控制代理服务器。它不仅意味着捕获和存储HTTP请求流,还意味着网络特性(例如带宽和延迟)和流量过滤的定制。最后一点对于分析第三方组件对整个站点性能的影响极其重要。

The testing process is rather straightforward. All you need is to create a Selenium script that describes common user actions. Then you arm your script with methods to control a proxy server via its API. It not only means capturing and storing streams of HTTP requests, but also customization of network characteristics (e.g., bandwidth and latency) and traffic filtering. The last point is extremely important for analysis of the impact of third-party components on overall site performance.

最后,您可以将每个页面或异步事件的 HAR 发送到本地存储库,即 HAR Storage。实际上,HAR Storage(http://harstorage.com/)是一个基于 Pylons 和 MongoDB 构建的简单 Web 应用程序。它允许从 HAR 文件中提取详细指标、存储测试结果以及可视化所有收集的数据。

Finally, you can send the HAR of each page or asynchronous event to a local repository: HAR Storage. HAR Storage (http://harstorage.com/) is a simple web application built on Pylons and MongoDB. It allows extracting detailed metrics from HAR files, storing test results, and visualizing all gathered data.
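To give a feel for what "extracting metrics from HAR files" involves, here is a minimal sketch that summarizes a parsed HAR object. It is not HAR Storage's code; it relies only on fields defined by the HAR format (`log.entries`, `startedDateTime`, `time`, and `response.bodySize`), and the choice of summary metrics is an assumption made for the example.

```javascript
// Sketch: summarize a parsed HAR object. Uses only standard HAR fields;
// the summary shape (requests, bytes, elapsedMs) is hypothetical.
function summarizeHar(har) {
  const entries = har.log.entries;
  let bytes = 0;
  let firstStart = Infinity;
  let lastEnd = -Infinity;

  entries.forEach(function (entry) {
    const start = Date.parse(entry.startedDateTime); // ISO 8601 timestamp
    firstStart = Math.min(firstStart, start);
    lastEnd = Math.max(lastEnd, start + entry.time); // entry.time is in ms
    if (entry.response.bodySize > 0) {
      bytes += entry.response.bodySize; // -1 means the size is unknown
    }
  });

  return {
    requests: entries.length,
    bytes: bytes,
    elapsedMs: entries.length ? lastEnd - firstStart : 0
  };
}
```

Storing a summary like this for every test run is enough to draw simple trend charts; the full HAR can be kept alongside it for waterfall views.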

优点

Advantages

主要优点是高度灵活性。使用 BrowserMob 代理,您可以在任何支持自定义代理设置的现代浏览器中测试网站。您甚至可以处理移动浏览器。

The key advantage is high flexibility. With BrowserMob Proxy, you can test a website in any modern browser that supports custom proxy settings. You can even deal with mobile browsers.

Selenium 又使得模拟任何复杂的用户场景成为可能。因此,您既可以分析单页面的速度,也可以分析复杂业务事务的性能。

Selenium in turn makes it possible to simulate any sophisticated user scenario. Therefore you can analyze both the speed of a single page and the performance of complex business transactions.

HAR Storage 也有很酷的功能。例如,您可以比较不同测试的结果。这对于分析第三方内容或调查站点速度与网络质量之间的关系有很大帮助(图 27-1)。

HAR Storage has cool features too. For instance, you can compare results of different tests. This is a great help for analyzing third-party content or for investigating the relationship between site speed and network quality (Figure 27-1).

图 27-1。性能趋势

Figure 27-1. Performance Trends

至少,使用 HAR Storage,您可以在任何开发阶段持续跟踪网站或应用程序的性能。

At least with HAR Storage you can continuously track the performance of your website or application at any development phase.

局限性

Limitation

这个世界上没有什么是完美的。BrowserMob Proxy 在浏览器外部运行,一方面对浏览器性能的影响最小;另一方面,浏览器内部事件是不可访问的。因此,您无法估计渲染或 JavaScript 解析的性能。像 dynaTrace AJAX Edition 这样的工具更适合此类任务。

Nothing is perfect in this world. BrowserMob proxy runs outside the browser and on the one hand has minimal impact on its performance; on the other hand, internal browser events are inaccessible. Thus you can’t estimate performance of rendering or JavaScript parsing. Tools like dynaTrace AJAX Edition are more suitable for such tasks.

对于某些人来说,这种方法可能看起来太复杂了。事实上并非如此。WebPagetest.org 让您只需输入 URL 即可享受结果。但是,如果您需要真正的跨浏览器测试、随时间的测量以及复杂用例的实现,那么此方法将适合您。

This approach may seem too complicated to some people. In fact it isn’t. WebPagetest.org lets you simply put in the URL and enjoy the result. But if you need real cross-browser testing, measurements over time, and implementation of complex use cases—this method will work for you.

结论

Conclusion

Web 性能仍然是关键方面,性能测试仍然是一个挑战。基于 Selenium、BrowserMob Proxy 和 HAR Storage 的框架可能会成为许多不断发展的项目的最终解决方案。

Web performance is still a critical aspect, and performance testing is still a challenge. Frameworks based on Selenium, BrowserMob Proxy, and HAR Storage may become the ultimate solution for many growing projects.

注意

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/a-simple-way-to-measure-website-performance/。最初发布于 2011 年 12 月 27 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/a-simple-way-to-measure-website-performance/. Originally published on Dec 27, 2011.

第 28 章超越带宽:UI 性能

Chapter 28. Beyond Bandwidth: UI Performance

大卫 ·卡尔霍恩

David Calhoun

介绍

Introduction

传统上,较早的性能研究关注的是加速服务器端的速度,但几年前,Steve Souders 开始研究主要性能瓶颈发生在客户端的想法。特别是从服务器将字节推送到客户端的方式。“减少 HTTP 请求”已成为提高前端性能的普遍格言,而这一问题在当今的移动浏览器世界中更为重要(通常在比宽带连接慢一个数量级的网络上运行)。

Traditionally, older performance studies were concerned with speeding up things on the server side, but a few years back, Steve Souders famously started research on the idea that the main performance bottleneck happened on the client side. In particular, in the way bytes were pushed to the client from the server. “Reduce HTTP requests” has become a general maxim for speeding up frontend performance, and that is a concern that’s even more relevant in today’s world of mobile browsers (often running on networks that are an order of magnitude slower than broadband connections).

这些研究关注的是延迟和带宽,这仍然是当今性能研究的重点。您可能已经熟悉标准 HTTP 瀑布图(图 28-1)。

These studies have been concerned with latency and bandwidth, and this still continues to be the focus of performance research today. You are probably already familiar with the standard HTTP waterfall chart (Figure 28-1).

图 28-1。HTTP瀑布图

Figure 28-1. HTTP waterfall chart

然而,我们慢慢开始看到前端堆栈的每个组件(HTML/CSS/JS)转向其他前端问题。特别是,人们非常关注 JavaScript 性能,jsPerf ( http://jsperf.com/ )的流行和 JavaScript 分析器的兴起证明了这一点。

However, we’re slowly starting to see a shift to other frontend concerns for each component of the frontend stack (HTML/CSS/JS). In particular, there’s been a great focus on JavaScript performance, a fact attested to by the popularity of jsPerf (http://jsperf.com/) and the rise of JavaScript profilers.

页面加载后:UI 层

After the Page Loads: The UI Layer

这一切都很好,但我们缺少同样重要的东西:表示(UI)层。尽管一些 UI 性能技巧已在整个社区传播多年,但它们通常只是被顺带提及,而带宽和延迟问题更多地处于研究的前沿。例如,即使在关注 CSS 的场合,重点也是减少 CSS 文件大小(http://www.stevesouders.com/blog/2010/07/03/velocity-top-5-mistakes-of-massive-css/)。但是昂贵的 CSS 选择器呢?或者在用户滚动时可能导致页面严重卡顿的 CSS 呢?

This is all well and good, but we're missing something equally important: the presentation (UI) layer. Although some UI performance tips have been disseminated throughout the community for years, they are often treated as an aside, with bandwidth and latency concerns much more at the forefront of research. For instance, where CSS is even a concern, the focus is on reducing CSS file size (http://www.stevesouders.com/blog/2010/07/03/velocity-top-5-mistakes-of-massive-css/). But what about expensive CSS selectors? Or CSS that may cause the page to lag horribly as the user scrolls?

UI 性能被低估的原因之一可能是因为它无法量化。作为工程师,如果说经过数小时的改进,网站“感觉”更灵敏,或者滚动更流畅,这有点令人不安。如果没有某种指标,就很难确定渲染瓶颈在哪里,或者即使我们在尝试消除瓶颈时取得了进展。

One of the reasons UI performance has been downplayed is perhaps because of its inability to be quantified. As engineers, it's a bit disconcerting to say that as a result of many hours of improvements, a website “feels” more responsive, or scrolls more smoothly. Without some sort of metrics, it's difficult to determine where the rendering bottlenecks are, or even if we're making progress when trying to smooth them out.

用户界面分析器

UI Profilers

幸运的是,我们现在才刚刚开始使用可以测量这些 UI 瓶颈的工具。“回流”和“重画”现在不仅仅是抽象的神秘事件——它们现在是我们可以在图表上指出的东西。

Luckily we're just now beginning to get access to tools that let us measure these UI bottlenecks. “Reflows” and “repaints” are now more than abstract mysterious happenings—they are now something we can point to on a chart.

在撰写本文时,CSS 分析器可在 Chrome 的开发人员工具以及 Opera 的调试器 (Dragonfly) 中使用。图 28-2显示了性能分析的新面貌。

At the time of writing, CSS profilers are available in Chrome's Developer Tools, as well as Opera's debugger (Dragonfly). Figure 28-2 shows the new face of performance profiling.

图 28-2。Opera 分析器

Figure 28-2. Opera profiler

除了使用这些新的分析器定位昂贵的 CSS 选择器之外,我们还可以使用一些更有用的 UI 性能调试工具。以下只是其中的一些。

Other than targeting expensive CSS selectors with these new profilers, we also have access to a few more useful tools for UI performance debugging. The following are just a few of them.

CSS压力测试

CSS Stress Test

CSS Stress Test(作者:Andy Edinborough)是一个小书签,它通过有选择地删除每个 CSS 声明来找出哪些 CSS 声明会减慢页面速度,然后对滚动速度性能进行计时。结果是一个看起来有点刺耳的小书签,但在追踪流氓 CSS 瓶颈方面似乎非常有用。自我提醒:显然,从性能角度来看,将 border-radius 应用于大量元素并不是一个好主意。

CSS Stress Test (by Andy Edinborough) is a bookmarklet that figures out which CSS declarations are slowing down the page by selectively removing each one, then subsequently timing the scroll speed performance. The result is a bookmarklet that's a bit jarring to watch, but seems quite useful in tracking down rogue CSS bottlenecks. Note to self: apparently applying border-radius to a ton of elements isn't a very good idea, performance-wise.
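The "remove one thing, measure, put it back" loop behind this kind of tool can be sketched as follows. This is not the bookmarklet's code: `disable`, `restore`, and `measureScroll` are hypothetical callbacks, and in a browser they might toggle a class or rule and time a scripted scroll.

```javascript
// Sketch: rank candidates (e.g., class names or rules) by how much scroll
// time is saved when each one is disabled. All three callbacks are
// hypothetical and supplied by the caller.
function rankByScrollCost(candidates, disable, restore, measureScroll) {
  const baseline = measureScroll(); // scroll time with everything enabled
  return candidates
    .map(function (candidate) {
      disable(candidate);
      const time = measureScroll();
      restore(candidate); // put it back before testing the next one
      return { candidate: candidate, saved: baseline - time };
    })
    .sort(function (a, b) {
      return b.saved - a.saved; // biggest saving first
    });
}
```

The candidates at the top of the returned list are the ones worth investigating first; real measurements are noisy, so repeating each measurement several times would be a sensible refinement.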

CSS 分析器

CSS Profilers

CSS 分析器即将出现在您身边的浏览器中,这将使我们更深入地了解所写 CSS 的实际速度,使我们摆脱模糊和神秘的规则。通用选择器(*)真的那么昂贵吗?border-radius、box-shadow 和 rgba 值真的会消耗性能吗?现在我们有办法来衡量这些担忧了!

A CSS profiler is coming to a browser near you, which will give us much more insight into the actual speed of the CSS we write, moving us forward from vague and mysterious rules. Is the universal selector (*) really that expensive? Are border-radius, box-shadow, and rgba values really performance drains? Now we have ways to measure those concerns!

CSS Lint

CSS Lint

CSS Lint(由 Nicole Sullivan 和 Nicholas Zakas 编写)是一组最佳实践(https://github.com/stubbornella/csslint/wiki/Rules)(您可能不同意所有这些,但这没关系),包括一些专门针对 UI 性能的有用规则。运行你的样式表,它会给你一些关于到底要改进什么的有用提示。

CSS Lint (by Nicole Sullivan and Nicholas Zakas) is a set of best practices (https://github.com/stubbornella/csslint/wiki/Rules) (you may not agree with them all, but that's OK), including a few helpful rules that target UI performance specifically. Run your stylesheets through it, and it'll give you some helpful tips on what exactly to improve.

DOM Monster

DOM Monster

DOM Monster (由 Amy Hoy 和 Thomas Fuchs 开发)旨在作为 JavaScript 分析器伴侣,但请记住 DOM(文档对象模型)的复杂性也会影响 UI 重绘和回流。减少膨胀对于在线数据以及 UI 渲染和 JavaScript DOM 访问来说都是更好的选择。

DOM Monster (by Amy Hoy and Thomas Fuchs) is intended as a JavaScript profiler companion, but remember that the complexity of the DOM (Document Object Model) will also affect UI repaints and reflows. Reducing that bloat is better for data down the wire, as well as for both UI rendering and JavaScript DOM access.

速度感知

Perception of Speed

如果您考虑一下,所有性能都与用户如何感知性能有关。虽然我们最关心的是真正的性能改进,但我们必须认识到局限性,并意识到我们并不总是能够控制带宽、延迟或用户浏览器的速度。我们在其他地方已经尽力了,但在这里我们有时不得不假装做到这一点。“假装它,直到你成功!”

If you think about it, all of performance is concerned with how performance is perceived by the user. While we're mostly concerned with real performance improvements, we have to recognize the limitations and realize that we don't always have control over bandwidth, latency, or the speed of a user's browser. Where we've already done our best elsewhere, here we sometimes have to fake it. “Fake it ’til you make it!”

我说的假装是什么意思?在一种情况下,这可能意味着在可能的情况下预加载内容,这就是 Gmail 移动版在用户单击“显示更多消息...”按钮之前所做的事情。用户点击后,内容实际上已经加载完毕。这只是一个 UI 花招来显示更新的新内容,而且这个过程发生得非常快。发出原始 HTTP 请求需要多长时间并不重要,因为无论哪种方式,用户的体验都是相同的,他们的感觉是界面非常快。这只是良好的用户体验设计与良好的工程完美结合的一个例子。

What do I mean by faking it? In one circumstance this might mean preloading content where possible, which is what Gmail mobile does before the user clicks on the “Show more messages…” button. After the user clicks, the content has actually already been loaded. It's just a UI sleight-of-hand to show the updated new content, and this happens extremely fast. It doesn't really matter how long it took to make the original HTTP request, because either way the experience is the same for the user, and their perception is that the interface is extremely fast. This is just one example of a great marriage of good user experience design with good engineering.

“假装”也可能意味着简单地做出响应并在用户采取行动后快速向用户显示视觉指示符。无论您如何优化 HTTP 请求或连接速度有多快,如果您在用户执行操作后没有给出指示,他们可能会重复他们的操作(在触摸屏上单击或再次点击)并且离开时只留下了对缓慢界面的痛苦记忆。

“Faking it” might also mean simply being responsive and quickly showing the user a visual indicator after they take an action. It doesn't matter how well you optimize HTTP requests or how fast the connection is—if you don't give an indication after the user performs an action, they will likely repeat their action (a click or another tap on the touchscreen) and come away with just a bitter memory of a sluggish interface.

另一个巧妙技术的例子是 Flickr,他们将架构从 YUI 2 转移到 YUI 3(请参阅 Ross Harmes 的讨论:http://www.youtube.com/watch?v=05C0GQPKA4g)。尽管 Flickr 团队利用了组合 HTTP 请求的优势,但初始加载的延迟意味着用户可能会在 JavaScript 完全加载、解析和执行之前开始采取操作。由于 Flickr 逐步增强其网页,这意味着如果没有可用的 JavaScript,用户就会进入为禁用 JavaScript 的用户设计的后备页面。这正是这些快速用户最终的结局,因为他们在 JavaScript 有机会覆盖这些用于后备的 URL 之前就采取了行动。

Another example of a clever technique here is Flickr, after they moved their architecture over from YUI 2 to YUI 3 (see Ross Harmes talk about it here: http://www.youtube.com/watch?v=05C0GQPKA4g). Though the Flickr team took advantage of combining HTTP requests, the delay of the initial load meant that a user might start taking actions before the JavaScript was fully loaded, parsed, and executed. Because Flickr progressively enhances their webpages, this means that without JavaScript available, the user gets taken to fallback pages intended for users with JavaScript disabled. And this is precisely where these quick users ended up, because they had taken actions before JavaScript had a chance to override these URLs intended for fallbacks.

他们的解决方案是在页面中加载一个迷你库来捕获页面上的所有事件并将它们排队以供稍后重播。最重要的是,这个小型库还提供了一个 UI(加载旋转器),以便在采取操作后向用户提供反馈,即使这意味着什么都没有发生,但事件会排队等候稍后在 JavaScript 准备就绪时重播。我们再次看到,有时假装它很重要,直到你成功为止!

Their solution was to load a mini-library in the page to capture all events on the page and queue them to be replayed later. Most importantly, this small library also provides a UI (a loading spinner) to give the user feedback after taking actions, even though nothing has actually happened yet beyond the event being queued up to be replayed later when the JavaScript is ready. Again, we see that sometimes it's just important to fake it ’til you make it!
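The queue-and-replay idea can be sketched in a few lines. This is an illustration of the general technique, not Flickr's library: `createActionQueue`, `capture`, and `ready` are hypothetical names, and `showSpinner` stands in for whatever immediate feedback the page gives.

```javascript
// Sketch of "queue now, replay later". Names are hypothetical; this is not
// Flickr's actual library.
function createActionQueue(showSpinner) {
  const pending = [];
  let handler = null;

  return {
    // Called for every user action (e.g., from a delegated click listener).
    capture: function (action) {
      if (handler) {
        handler(action); // the full JavaScript is ready: handle immediately
      } else {
        pending.push(action); // not ready yet: remember the action
        showSpinner(action); // and give the user immediate feedback
      }
    },
    // Called once the full application code has loaded.
    ready: function (realHandler) {
      handler = realHandler;
      while (pending.length) {
        handler(pending.shift()); // replay in the original order
      }
    }
  };
}
```

In a page, a tiny inline script would call `capture` from an early click listener, and the main application bundle would call `ready` as its last step.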

花絮

Tidbits

正如我之前提到的,UI 性能技巧已经流传了很长一段时间,但与延迟和带宽问题相比,它们在某种程度上被低估了。

As I mentioned before, UI performance tips have been circulating for quite a while, but they have been somewhat downplayed compared to latency and bandwidth issues.

以下是一些花絮,让您了解一些存在的问题:

Here’s a collection of tidbits to give you an idea of some of the concerns that are out there:

呼吁关注 UI 性能

Call for a Focus on UI Performance

性能不仅仅是将字节越过栅栏推入浏览器!用户的大部分体验发生在页面加载之后,因此我们仍然应该关注“加载页面”体验的性能。这适用于我们的 JavaScript,但同样重要的是我们的 CSS 及其对滚动速度和整体 UI 响应能力的影响。

Performance is more than pushing bytes over a fence into a browser! Much of the user’s experience happens after a page loads, so we should still be concerned about the performance of a “loaded page” experience. This applies to our JavaScript, but equally important is our CSS and its impact on scroll speed and overall UI responsiveness.

这可能意味着我们有时使用图像而不是尚未准备好迎接黄金时段的新 CSS 花哨的性能会更好,这取决于我们权衡成本并了解权衡!它还可以帮助我们欣赏新的 CSS 功能或精美的演示,同时对其实际用途保持怀疑。

This might mean that we are sometimes better off performance-wise using images instead of new CSS fanciness that’s not yet ready for primetime, and it’s up to us to weigh the cost and understand the tradeoff! It also helps us appreciate new CSS features or fancy demos while remaining skeptical of their practical use.

最重要的是,如果您曾与 UI 性能问题作斗争并克服了它,全世界都可以从您的经验中学习!当您在博客上谈论它时,您可以为其他人节省一些时间,这些时间本可以用来陪伴家人,而这绝对更重要。我们现在需要的是更多来自 Marcel 和 Estelle 这样的人的文章,他们明白性能不仅仅是节省字节。

More than anything, if you struggled with a UI performance issue and overcame it, the world could learn from your experience! When you blog about it, you save other folks some time, time they could be spending with their families, which is definitely more important. What we need now is more articles from folks like Marcel and Estelle who understand that performance goes beyond simply saving bytes.

注意

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/beyond-bandwidth-ui-performance/。最初发布于 2011 年 12 月 28 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/beyond-bandwidth-ui-performance/. Originally published on Dec 28, 2011.

第 29 章 CSS 选择器性能发生了变化!(为了更好)

Chapter 29. CSS Selector Performance Has Changed! (For the Better)

妮可 沙利文

Nicole Sullivan

伟大的文章,例如 Dave Hyatt 的“编写高效 CSS”,帮助开发人员适应基本的选择器匹配环境。我们从 Steve Souders(和其他人)那里了解到,选择器从右到左匹配,并且某些选择器特别难以匹配,最好避免。例如,我们被告知后代选择器很慢,尤其是当最右边的选择器与页面上的许多元素匹配时。当我们没有这些信息时,所有这些都是很棒的信息,但事实证明,时代已经变了。感谢 Antti Koivisto 的一些出色工作,我们不再需要担心许多选择器。

Great articles, like Dave Hyatt’s “Writing Efficient CSS”, helped developers adapt to a rudimentary selector matching landscape. We learned from Steve Souders (and others) that selectors match from right to left, and that certain selectors were particularly arduous to match and should best be avoided. For example, we were told that descendant selectors were slow, especially when the right-most selector matched many elements on the page. All this was fantastic information when we had none, but as it turns out, times have changed. Thanks to some amazing work by Antti Koivisto there are many selectors we don’t need to worry about anymore.

Antti Koivisto 为 WebKit 核心贡献了代码,最近花了一些时间优化 CSS 选择器匹配。事实上,他在完成工作后说道:

Antti Koivisto contributes code to WebKit core and recently spent some time optimizing CSS selector matching. In fact, after finishing his work, he said:

我的观点是,作者不需要担心优化选择器(据我所知,他们通常不需要),这应该是引擎的工作。

My view is that authors should not need to worry about optimizing selectors (and from what I see, they generally don’t), that should be the job of the engine.

哇!这对我来说听起来棒极了。我希望能够以对我的架构有意义的方式使用选择器,并让渲染引擎处理选择器优化。他做了什么?不仅仅是一件事,而是他创建了多个级别的优化 - 我们将特别关注四个优化:

Wow! That sounds fantastic to me. I’d love to be able to use selectors in a way that makes sense for my architecture and let the rendering engine handle selector optimization. So, what did he do? Not just one thing, rather he created multiple levels of optimization—we’ll take a look at four optimizations in particular:

  • 风格分享

  • Style sharing

  • 规则哈希

  • Rule hashes

  • 祖先过滤器

  • Ancestor filters

  • 快速路径

  • Fast path

风格分享

Style Sharing

样式共享允许浏览器找出样式树中的一个元素与其已经找出的元素具有相同的样式。为什么同样的计算要进行两次?

Style sharing allows the browser to figure out that one element in the style tree has the same styles as something it has already figured out. Why do the same calculation twice?

例如:

For example:

<div>
  <p>foo</p>
  <p>bar</p>
</div>

如果浏览器引擎已经计算了第一段的样式,则不需要为第二段再次计算样式。这是一个简单但巧妙的更改,为浏览器节省了大量工作。

If the browser engine has already calculated the styles for the first paragraph, it doesn’t need to do so again for the second paragraph. A simple but clever change that saves the browser a lot of work.
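
To make the idea concrete, here is a minimal sketch of style sharing as a cache keyed on what makes two elements look identical to the matcher. The `createStyleResolver` function, the element shape (`tag`, `parentId`, `classes`, `id`), and the sharing key are all assumptions made for illustration; a real engine such as WebKit checks more conditions before it shares a style.

```javascript
// Hypothetical sketch: reuse a computed style when two elements look the
// same to the matcher. The key used here (parent, tag, classes, no id) is an
// assumption; real engines apply stricter sharing conditions.
function createStyleResolver(computeStyle) {
  const cache = new Map();
  let computations = 0;

  function shareKey(el) {
    const classes = (el.classes || []).slice().sort().join('.');
    return [el.parentId, el.tag, classes].join('|');
  }

  return {
    resolve(el) {
      if (el.id) {
        // Elements with an id are never shared in this sketch.
        computations++;
        return computeStyle(el);
      }
      const key = shareKey(el);
      if (!cache.has(key)) {
        computations++;
        cache.set(key, computeStyle(el));
      }
      return cache.get(key);
    },
    get computations() {
      return computations;
    },
  };
}

// Two sibling paragraphs, as in the markup above.
const resolver = createStyleResolver(el => ({ display: el.tag === 'p' ? 'block' : 'inline' }));
const firstStyle = resolver.resolve({ tag: 'p', parentId: 'div1', classes: [] });
const secondStyle = resolver.resolve({ tag: 'p', parentId: 'div1', classes: [] });
```

The second paragraph gets the very same style object as the first, and the expensive `computeStyle` callback runs only once.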

规则哈希

Rule Hashes

到现在为止,我们都知道浏览器从右到左匹配样式,因此最右边的选择器非常重要。规则哈希根据最右边的选择器将样式表分成组。例如,以下样式表将分为三组(表 29-1)。

By now, we all know that the browser matches styles from right to left, so the rightmost selector is really important. Rule hashes break a stylesheet into groups based on the rightmost selector. For example the following stylesheet would be broken into three groups (Table 29-1).

a {}
div p {}
div p.legal {}
#sidebar a {}
#sidebar p {}

表 29-1。选择器组

Table 29-1. Selector groups

a               p               p.legal
a {}            div p {}        div p.legal {}
#sidebar a {}   #sidebar p {}

当浏览器使用规则哈希时,它不必查看整个样式表中的每个选择器,而是查看实际上有机会匹配的小得多的选择器组。另一个简单但非常聪明的更改,消除了页面上每个 HTML 元素的不必要的工作!

When the browser uses rule hashes, it doesn’t have to look through every single selector in the entire stylesheet, but through a much smaller group of selectors that actually have a chance of matching. Another simple but very clever change that eliminates unnecessary work for every single HTML element on the page!
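
The grouping can be sketched in a few lines. This is an illustration only: `groupByRightmost` and `candidatesFor` are hypothetical names, the element shape is assumed, and the grouping key is simply the rightmost compound selector so that the result mirrors Table 29-1. It does not reproduce WebKit's actual data structures.

```javascript
// Hypothetical sketch of a rule hash: bucket selectors by their rightmost
// compound selector, then look up only the buckets an element could match.
function groupByRightmost(selectors) {
  const groups = new Map();
  for (const selector of selectors) {
    // Split on descendant, child, and sibling combinators; keep the last part.
    const key = selector.trim().split(/[\s>+~]+/).pop();
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(selector);
  }
  return groups;
}

function candidatesFor(groups, element) {
  // <p class="legal"> can only match rules in the "p" and "p.legal" groups.
  const keys = [element.tag, ...element.classes.map(c => `${element.tag}.${c}`)];
  return keys.flatMap(key => groups.get(key) || []);
}

const ruleGroups = groupByRightmost(['a', 'div p', 'div p.legal', '#sidebar a', '#sidebar p']);
const legalCandidates = candidatesFor(ruleGroups, { tag: 'p', classes: ['legal'] });
```

For the paragraph with class `legal`, the two rules that end in `a` are never examined at all.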

祖先过滤器

Ancestor Filters

祖先过滤器有点复杂。它们是 概率过滤器,用于计算选择器匹配的可能性。因此,当相关元素没有所需的匹配祖先时,祖先过滤器可以快速消除规则。在本例中,它测试后代和子选择器,并根据类、id 和标签进行匹配。特别是后代选择器以前被认为非常慢,因为渲染引擎需要循环遍历每个祖先节点来测试匹配。布隆过滤器来救援。

The ancestor filters are a bit more complex. They are probabilistic filters which calculate the likelihood that a selector will match. For that reason, the ancestor filter can quickly eliminate rules when the element in question doesn’t have the required matching ancestors. In this case, it tests for descendant and child selectors and matches based on class, id, and tag. Descendant selectors in particular were previously considered to be quite slow because the rendering engine needed to loop through each ancestor node to test for a match. Bloom filter to the rescue.

布隆过滤器是一种数据结构,可让您测试特定选择器是否是集合的成员。听起来很像选择器匹配,对吧?布隆过滤器测试 CSS 规则是否是与您当前正在测试的元素匹配的规则集的成员。布隆过滤器最酷的一点是,误报是可能的,但误报则不然。这意味着,如果布隆过滤器表示选择器与当前元素不匹配,则浏览器可以停止查找并移动到下一个选择器。节省大量时间!另一方面,如果布隆过滤器表明当前选择器匹配,则浏览器可以继续使用正常的匹配方法以 100% 确定它是匹配的。较大的样式表会有更多的误报,

A Bloom filter is a data structure which lets you test if a particular selector is a member of a set. Sounds a lot like selector matching, right? The Bloom filter tests whether a CSS rule is a member of the set of rules that match the element you are currently testing. The cool thing about the Bloom filter is that false positives are possible, but false negatives are not. That means that if the Bloom filter says a selector doesn’t match the current element, the browser can stop looking and move on to the next selector. A huge time saver! On the other hand, if the Bloom filter says the current selector matches, the browser can continue with normal matching methods to be 100% certain it is a match. Larger stylesheets will have more false positives, so keeping your stylesheets reasonably lean is a good idea.

祖先过滤器使得匹配后代和子选择器变得非常快。它还可用于将速度较慢的选择器范围限制为最小子树,因此浏览器很少需要处理效率较低的选择器。

The ancestor filter makes matching descendant and child selectors very fast. It can also be used to scope otherwise slow selectors to a minimal subtree so the browser only rarely needs to handle less efficient selectors.
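
A toy version shows why a "no" from the filter is trustworthy while a "yes" is not. This is a minimal sketch under stated assumptions: the `BloomFilter` class, its two hash functions, the 256-bit table, and the `canSkipRule` helper are all invented for illustration and do not describe WebKit's implementation.

```javascript
// Hypothetical sketch of an ancestor filter built on a tiny Bloom filter.
class BloomFilter {
  constructor(bits = 256) {
    this.bits = bits;
    this.table = new Uint8Array(bits);
  }

  // Two simple string hashes; the choice of hashes is an assumption.
  hashes(value) {
    let h1 = 5381;
    let h2 = 0;
    for (let i = 0; i < value.length; i++) {
      const c = value.charCodeAt(i);
      h1 = ((h1 * 33) ^ c) >>> 0;
      h2 = (h2 * 31 + c) >>> 0;
    }
    return [h1 % this.bits, h2 % this.bits];
  }

  add(value) {
    for (const h of this.hashes(value)) this.table[h] = 1;
  }

  // false means "definitely absent"; true only means "possibly present".
  mightContain(value) {
    return this.hashes(value).every(h => this.table[h] === 1);
  }
}

// A rule can be skipped as soon as one required ancestor is definitely absent.
function canSkipRule(ancestorFilter, requiredAncestors) {
  return requiredAncestors.some(name => !ancestorFilter.mightContain(name));
}

// Identifiers collected from the ancestors of the element being styled.
const ancestors = new BloomFilter();
['div', '#sidebar', '.nav'].forEach(name => ancestors.add(name));

const mustCheckSidebarRule = !canSkipRule(ancestors, ['#sidebar']);
const emptyFilterSkips = canSkipRule(new BloomFilter(), ['#sidebar']);
```

Anything that was added always reports as possibly present, so a rule that could match is never thrown away. A rule that the filter lets through still has to go through normal matching, because two different identifiers can set the same bits.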

快速通道

Fast Path

快速路径使用非递归、完全内联循环重新实现更通用的匹配逻辑。它用于匹配具有以下任意组合的选择器:

Fast path re-implements more general matching logic using a non-recursive, fully inlined loop. It is used to match selectors that have any combination of:

  • 后代选择器、子选择器和子选择器组合器

  • Descendant, child, and sub-selector combinators

  • 标签、ID、类和属性组件选择器

  • Tag, ID, class, and attribute component selectors

快速路径提高了如此大的组合器和选择器子集的性能。事实上,他们发现总体提高了 25%,其中后代选择器和子选择器提高了两倍。作为一个优点,除了样式匹配之外,还为 querySelectorAll 实现了这一点。

Fast path improves performance across a large subset of combinators and selectors. In fact, they saw a 25% improvement overall, with a two times improvement for descendant and child selectors. As a plus, this has been implemented for querySelectorAll in addition to style matching.

既然很多事情都已经改善了,那还有什么进展缓慢呢?

If so many things have improved, what’s still slow?

怎么还慢?

What Is Still Slow?

根据 Antti 的说法,直接和间接相邻组合器仍然可能很慢,但是,祖先过滤器和规则哈希可以降低影响,因为这些选择器很少会匹配。他还表示,webkit 仍有很大的空间来优化伪类和元素,但无论如何,它们比尝试使用 JavaScript 和 DOM 操作做同样的事情要快得多。事实上,尽管仍有改进的空间,安蒂说:

According to Antti, direct and indirect adjacent combinators can still be slow. However, ancestor filters and rule hashes can lower the impact, as those selectors will only rarely be matched. He also says that there is still a lot of room for WebKit to optimize pseudo-classes and pseudo-elements, but regardless, they are much faster than trying to do the same thing with JavaScript and DOM manipulations. In fact, though there is still room for improvement, Antti says:

从风格匹配的角度来看,适度使用几乎所有东西都会表现得很好。

Used in moderation pretty much everything will perform just fine from the style matching perspective.

我喜欢那个的声音。结论是,如果我们能够 保持样式表大小合理,并且合理地使用我们的选择器,我们就不需要扭曲自己来匹配昨天的浏览器环境。太棒了,安蒂!

I like the sound of that. The take-away is that if we can keep stylesheet size sane, and be reasonable with our selectors, we don’t need to contort ourselves to match yesterday’s browser landscape. Bravo, Antti!

想了解更多吗?查看 Paul Irish 关于 CSS 性能的演示 ( http://dl.dropbox.com/u/39519/talks/cssperf/index.html )。

Want to learn more? Check out Paul Irish’s presentation on CSS performance (http://dl.dropbox.com/u/39519/talks/cssperf/index.html).

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/css-selector-performance-has-changed-for-the-better/。最初发布于 2011 年 12 月 29 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/css-selector-performance-has-changed-for-the-better/. Originally published on Dec 29, 2011.

第 30 章:PhantomJS 和 recognize.js 让你失去理智

Chapter 30. Losing Your Head with PhantomJS and confess.js

詹姆斯 ·皮尔斯

James Pearce

我们渴望有强大而可靠的方法来判断 Web 应用程序的性能和用户体验。但多年来,我们不得不依靠各种近似技术来做到这一点:协议级综合和测量、古怪的浏览器自动化、脆弱的事件脚本——所有这些都伴随着一种预感,即我们仍然没有完全捕捉 到真实用户使用真实浏览器的行为。

We yearn for powerful and reliable ways to judge the performance and user experience of web applications. But for many years, we’ve had to rely on a variety of approximate techniques to do so: protocol-level synthesis and measurement, cranky browser automation, and fragile event scripting, all accompanied by a hunch that we’re still not quite capturing the behavior of real users using real browsers.

今年最有趣的开源项目之一:PhantomJS ( http://phantomjs.org/ )。感谢 Ariya Hidayat ( http://ariya.ofilabs.com/ ),每个 Web 开发人员的工具箱中都有一个有价值的新工具,它提供了一个无头但功能齐全的 WebKit 浏览器,可以轻松地从命令行启动,并且然后使用 JavaScript 编写脚本并进行操作。

Enter one of this year’s most interesting open source projects: PhantomJS (http://phantomjs.org/). Thanks to Ariya Hidayat (http://ariya.ofilabs.com/), there’s a valuable new tool for every web developer’s toolbox, providing a headless, yet fully-featured, WebKit browser that can easily be launched off the command line, and then scripted and manipulated with JavaScript.

我使用 PhantomJS 来支持 recognize.js ( https://github.com/jamesgpearce/confess ),这是一个小型库,可以轻松分析用于各种目的的网页和应用程序。它目前有两个主要功能:提供简单的页面性能配置文件,以及生成应用程序缓存清单。让我们快速浏览一下它们。

I’ve used PhantomJS to underpin confess.js (https://github.com/jamesgpearce/confess), a small library that makes it easy to analyze web pages and apps for various purposes. It currently has two main functions: to provide simple page performance profiles, and to generate app cache manifests. Let’s take them for a quick spin.

绩效总结

Performance Summaries

安装后,使用 recognize.js 做的最简单的事情就是生成给定页面的简单性能配置文件。使用 PhantomJS 浏览器,加载 URL、获取计时并发出摘要输出 — 所有这些都只需一个命令:

Once installed, the simplest thing to do with confess.js is generate a simple performance profile of a given page. Using the PhantomJS browser, the URL is loaded, its timings taken, and a summary output emitted—all with one single command:

$> phantomjs confess.js http://calendar.perfplanet.com/2011/ performance

在这里,Confession.js 脚本与 PhantomJS 二进制文件一起启动,定向到 PerfPlanet 博客页面,然后预期生成如下内容:

Here, the confess.js script is launched with the PhantomJS binary, directed to go to the PerfPlanet blog page, and then expected to generate something like the following:

Elapsed load time:   6199ms
   # of resources:       30

 Fastest resource:    408ms; http://calendar.perfplanet.com/wp-content/themes/wpc/style.css
 Slowest resource:   3399ms; http://calendar.perfplanet.com/photos/joshua-70tr.jpg
  Total resources:  69080ms

Smallest resource:    2061b; http://calendar.perfplanet.com/wp-content/themes/wpc/style.css
 Largest resource:    8744b; http://calendar.perfplanet.com/photos/joshua-70tr.jpg
  Total resources:  112661b; (at least)

这个简单的输出并没有什么革命性的——当然,在幕后,它来自一个真正的 WebKit 浏览器。我们可以对浏览器发出和接收的每个请求和响应进行可靠的脚本访问,而无需对测试页面进行任何更改。

There is nothing revolutionary about this simple output, apart from the fact that, under the covers, it is coming from a real WebKit browser. We’re getting solid scriptable access to every request and response that the browser is making and receiving, without having to make any changes to the page under test.
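
As a rough illustration of how a summary like this could be derived, the sketch below reduces a list of resource records to the same headline numbers. The `summarizeResources` function and the record shape (`url`, `duration` in milliseconds, optional `size` in bytes) are assumptions for this example and are not taken from the confess.js source. In a PhantomJS script, records like these could plausibly be collected from the `page.onResourceRequested` and `page.onResourceReceived` callbacks.

```javascript
// Hypothetical sketch: summarize per-resource timings and sizes.
function summarizeResources(resources) {
  const byDuration = [...resources].sort((a, b) => a.duration - b.duration);
  // Some responses report no size, so totals are a lower bound ("at least").
  const sized = resources.filter(r => typeof r.size === 'number');
  const bySize = [...sized].sort((a, b) => a.size - b.size);

  return {
    count: resources.length,
    fastest: byDuration[0],
    slowest: byDuration[byDuration.length - 1],
    totalDuration: resources.reduce((sum, r) => sum + r.duration, 0),
    smallest: bySize[0],
    largest: bySize[bySize.length - 1],
    totalSize: sized.reduce((sum, r) => sum + r.size, 0),
  };
}

const summary = summarizeResources([
  { url: '/style.css', duration: 408, size: 2061 },
  { url: '/photo.jpg', duration: 3399, size: 8744 },
  { url: '/script.js', duration: 900 }, // size unknown
]);
```

Note that the total duration is the sum of the individual resource times, which is why it can be much larger than the elapsed load time when requests overlap.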

因此,您可能已经可以想象使用此仪器可以完成更多工作。例如,我有一些轻松的乐趣,让 recognize.js(带有详细标志)发出页面及其资源的瀑布图 - 全部采用彩色 ASCII 艺术:

So already you might be able to imagine there’s a lot more that can be done with this instrumentation. I had some lighthearted fun getting confess.js (with a verbose flag) to emit waterfall charts of a page and its resources, for example—all in technicolor ASCII-art:

  1|-------                                                         |
  2|       ------------                                             |
  3|                 -----------                                    |
  4|                 ---------------------                          |
  5|                  -----------                                   |
  6|                  -------                                       |
  7|                  -------                                       |
  8|                  -------                                       |
  9|                  -------                                       |
 10|                                     ----------                 |
 11|                                     ----------------------     |
 12|                                     ----                       |
    ...

  1:   1679ms;       -b; http://cnn.com/
  2:   3115ms;       -b; http://www.cnn.com/
  3:   2716ms;       -b; http://z.cdn.turner.com/...css/hplib-min.css
  4:   5465ms;       -b; http://z.cdn.turner.com/...5/js/hplib-min.js
  5:   2952ms;       -b; http://z.cdn.turner.com/.../globallib-min.js
  6:   1681ms;      21b; http://content.dl-rms.co...r/5721/nodetag.js
  7:   1698ms;       -b; http://icompass.insightexpressai.com/97.js
  8:   1743ms;       -b; http://ad.insightexpress...px?publisherID=97
  9:   1706ms;       -b; http://js.revsci.net/gat...gw.js?csid=A09801
 10:   2494ms;    7732b; http://i.cdn.turner.com/...ader/hdr-main.gif
 11:   5694ms;   44091b; http://i2.cdn.turner.com...quare-t1-main.jpg
 12:   1023ms;     858b; http://i.cdn.turner.com/...earch_hp_text.gif
    ...

虽然这似乎是从 WebKit Web Inspector 工具中获得的丰富诊断的一个糟糕替代方案,但它确实提供了一种快速概述页面性能概况和潜在瓶颈的好方法。当然,更重要的是,它可以根据您的意愿轻松扩展、从命令行运行、自动化和集成。

While this might seem a poor alternative to the rich diagnostics that can be gained from, say, the WebKit Web Inspector tools, it does provide a nice way to get a quick overview of the performance profile—and potential bottlenecks—of a page. And, of course, and more importantly, it can be easily extended, run from the command line, automated, and integrated as you wish.
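
Drawing such a chart takes very little code once start and end times are known. The following is a minimal sketch, not the confess.js implementation: `renderWaterfall` is a hypothetical name, the records are assumed to carry `start` and `end` timestamps in milliseconds, and the sketch assumes at least one resource with a non-zero overall time span.

```javascript
// Hypothetical sketch: render resource timings as an ASCII waterfall.
function renderWaterfall(resources, width = 40) {
  const start = Math.min(...resources.map(r => r.start));
  const end = Math.max(...resources.map(r => r.end));
  const scale = width / (end - start); // columns per millisecond

  return resources.map((r, i) => {
    const offset = Math.floor((r.start - start) * scale);
    const length = Math.max(1, Math.round((r.end - r.start) * scale));
    const bar = (' '.repeat(offset) + '-'.repeat(length)).padEnd(width).slice(0, width);
    return `${String(i + 1).padStart(3)}|${bar}|`;
  });
}

const rows = renderWaterfall(
  [
    { start: 0, end: 1000 },
    { start: 1000, end: 4000 },
  ],
  8
);
console.log(rows.join('\n'));
```

Each row is scaled against the whole page load, so a bar that starts far to the right points at a resource that was discovered late.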

应用程序缓存清单

App Cache Manifest

同样,我们也可以使用无头浏览器来分析应用程序的实际内容,以便执行有用的任务。尽管 PhantomJS 中的 JavaScript 和页面的 JavaScript 之间存在运行时“中国墙”,但它的渗透性足以让我们根据 DOM 评估脚本函数,并将简单的结果结构返回到 recognize.js。

Similarly, we can also use a headless browser to analyze the application’s actual content in order to perform a useful task. Although there’s a run-time “Chinese wall” in PhantomJS between the JavaScript of the harness and the JavaScript of the page, it’s permeable enough to allow us to evaluate script functions against the DOM and have simple result structures returned to confess.js.

为什么我们想要以自动化的方式分析页面的 DOM?好吧,以应用程序缓存清单机制为例:它提供了一种方法来强制浏览器应为给定应用程序显式缓存哪些资源,但是,尽管语法看似简单,但跟踪所有资源可能会令人沮丧。您使用过的资产。为了最大限度地发挥使用应用程序缓存的优势,您需要确保考虑每个资源:无论是图像、脚本、样式表,甚至是从这些资源内部进一步引用的资源。

Why might we want to analyze a page’s DOM in an automated way? Well, take the app cache manifest mechanism, for example: it provides a way to mandate to a browser which resources should be explicitly cached for a given application, but, despite a deceptively simple syntax, it can be frustrating to keep track of all the assets you’ve used. To maximize the benefits of using app cache, you want to ensure that every resource is considered: whether it’s an image, a script, a stylesheet—or even resources further referred to from inside those.

对于无头浏览器来说,这是一项完美的工作:加载文档后,我们可以检查它以识别它实际使用的资源。在真实浏览器中针对真实 DOM 执行此操作,比通过静态分析 Web 标记更有可能识别应用程序在运行时所需的依赖项。

This is the perfect job for a headless browser: once a document is loaded, we can examine it to identify the resources it actually uses. Doing this against the real DOM in a real browser makes it far more likely to identify dependencies required by the app at run-time than would be possible through statically analyzing web markup.

同样,类似的事情可以很容易地成为自动化构建和部署过程的一部分。例如:

And again, something like this could easily become part of an automated build-and-deploy process. For example:

$> phantomjs confess.js http://calendar.perfplanet.com/2011/ appcache

...将导致生成以下清单:

…will result in the following manifest being generated:

CACHE MANIFEST

# This manifest was created by confess.js, http://github.com/jamesgpearce/confess
#
# Time: Fri Dec 23 2011 13:46:42 GMT-0800 (PST)
# Retrieved URL: http://calendar.perfplanet.com/2011/
# User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.4.0 Safari/534.34

CACHE:
/photos/aaron-70tr.jpg
/photos/alex-70tr.jpg
/photos/alois-70tr.jpg
[...]

http://calendar.perfplanet.com/wp-content/themes/wpc/globe.png

http://calendar.perfplanet.com/wp-content/themes/wpc/style.css

NETWORK:
*

根据您的应用程序,此处可能会有大量输出。但就最终用户的浏览器而言,关键部分是 CACHE 和 NETWORK 块。后者始终设置为 * 通配符,但前者的显式资源列表是根据您运行该工具所针对的 URL 自动构建的。

Depending on your app, there might be a lot of output here. But the key parts, as far as the eventual user’s browser will be concerned, are the CACHE and NETWORK blocks. The latter is always set to the * wildcard, but the former list of explicit resources is built up automatically from the URL you ran the tool against.
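
The shape of the file is simple enough to sketch. The example below assembles a manifest from a list of harvested URLs; `buildManifest` is a hypothetical helper, the example URLs are made up, and the choice to write same-origin resources as paths is an assumption of this sketch rather than a description of what confess.js emits.

```javascript
// Hypothetical sketch: build an app cache manifest from harvested URLs.
function buildManifest(urls, pageUrl) {
  const origin = new URL(pageUrl).origin;

  // Resolve every URL against the page, shorten same-origin ones to a path,
  // then de-duplicate and sort so the output is stable between runs.
  const entries = [...new Set(urls.map(u => {
    const abs = new URL(u, pageUrl);
    return abs.origin === origin ? abs.pathname + abs.search : abs.href;
  }))].sort();

  return [
    'CACHE MANIFEST',
    '',
    `# Retrieved URL: ${pageUrl}`,
    '',
    'CACHE:',
    ...entries,
    '',
    'NETWORK:',
    '*',
    '',
  ].join('\n');
}

const manifest = buildManifest(
  [
    'http://example.com/a.png',
    '/a.png',
    'http://cdn.example.net/lib.js',
  ],
  'http://example.com/2011/'
);
```

The `CACHE MANIFEST` line has to come first, and the `*` under `NETWORK:` lets everything that is not listed fall through to the network.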

对于应用程序缓存必杀技,您只需将此输出通过管道传输到文件,从<html>目标文档的元素链接到该文件,当然还要确保该文件在部署时是使用文本/缓存内容类型生成的-显现。

For app cache nirvana, you’d simply need to pipe this output to a file, link to it from the <html> element of your target document, and of course ensure that the file, when deployed, is generated with a content type of text/cache-manifest.

顺便说一句,confession.js 通过四种方式获取依赖资源列表本身。首先,一旦文档加载到 PhantomJS 中,就会遍历 DOM,并在、 和元素上查找 URLsrc和 属性。其次,遍历文档样式表的 CSSOM,并查找该类型的属性值。第三,遍历整个 DOM,该方法获取所有剩余资源。最后,该工具可以配置为监视其他网络请求,以防万一,例如,页面中的脚本发出了 DOM 或 CSSOM 内容无法预测的某些其他内容请求。hrefscriptimglinkCSS_URIgetComputedStyle

As an aside, the list of dependent resources itself is harvested by confess.js in four ways. First, once the document is loaded in PhantomJS, the DOM is traversed, and URLs sought in src and href attributes on script, img, and link elements. Second, the CSSOM of the document’s stylesheets is traversed, and property values of the CSS_URI type are sought. Third, the entire DOM is traversed, and the getComputedStyle method picks up any remaining resources. And last, the tool can be configured to watch for additional network requests, just in case, say, some additional content request has been made by a script in the page that would not have been predicted by the contents of the DOM or CSSOM.

(当然,有很多有用的方法来配置整个清单生成。您可以过滤入或出 URL,以便从远程域中排除某些文件类型或资源。您也可以在文档生成后等待一段时间在执行提取之前加载,以防您知道延迟脚本可能会添加对其他资源的引用。文档中包含有关所有这些的信息(https://github.com/jamesgpearce/confess/blob/master/README)。医学博士)。)

(Naturally, there are many useful ways to configure the manifest generation as a whole. You can filter URLs in or out in order to, say, exclude certain file types or resources from remote domains. You can also wait for a certain period after the document loads before performing the extraction, in case you know that a deferred script might be adding in references to other resources. There’s information about all this in the docs (https://github.com/jamesgpearce/confess/blob/master/README.md).)

步步高升

Onward and Upward

我们刚刚接触了两个简单的例子,说明了无头浏览器方法通常可以做什么。该技术提供了一种强大的方法来分析 Web 应用程序,并更接近于了解真实用户的体验和真实应用程序的行为。

We’ve just touched on two simple examples of what can be done with a headless browser approach in general. The technique provides a powerful way to analyze web applications, and get closer to being able to understand real users’ experience and real apps’ behavior.

我当然强烈建议您查看PhantomJS,尝试编写一些简单活动的脚本,并考虑如何使用它来理解和自动化网站和应用程序行为。(我什至不确定我是否提到过它也有截取屏幕截图的功能。)当然,也请随意尝试一下conference.js——它的目标是让一些事情变得更容易自动化。这些常见任务。我总是接受拉取请求!

I’d certainly urge you to check out PhantomJS, try scripting some simple activities, and think about how you can use it to understand and automate website and application behavior. (I’m not even sure I mentioned yet that it has the capability to take screenshots, too.) And of course, feel free to give confess.js a try, too—with its humble goal of making it easier to help automate some of those common tasks. I’m always accepting pull requests!

但无论您选择什么工具,一定要享受性能冒险的乐趣,挑战极限,让 Web 成为一个美妙的地方。

But whatever your tools of choice, do have fun on your performance adventures, push the envelope, make the Web a wonderful place.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/losing-your-head-with-phantomjs-and-confess-js/。最初发布于 2011 年 12 月 30 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/losing-your-head-with-phantomjs-and-confess-js/. Originally published on Dec 30, 2011.

第31章 测量两次,切割一次

Chapter 31. Measure Twice, Cut Once

汤姆 ·休斯·克劳彻

Tom Hughes-Croucher

英语有句名言:“测量两次,切割一次”,如果你用手做任何事情,这一点尤其重要。一旦你用锯子锯完一块木头,发现短了 5 毫米,就很难修复它。虽然软件很难像浪费木材这样的原材料一样浪费,但你肯定会浪费时间。

There is a famous saying in English, “Measure twice, cut once” which is especially important if you do anything with your hands. Once you’ve cut a piece of wood with a saw and you find you are 5mm too short, it’s pretty hard to fix it. While software is hard to waste in the same way you can waste a raw material like wood, you can certainly waste your time.

像这本书这样的资源是一个非常好的工具,可以帮助你找到想法并将其应用到你自己的工作中。本书的许多作者都很幸运,因为他们花费了大量时间为 Facebook、Yahoo! 和 Google(还有您的沃尔玛和其他公司)等公司优化大型网站。然而,大多数开发人员除了性能之外还有许多其他责任。

A resource like this book is a really great tool for finding ideas to apply to your own work. Many of the authors of this book are lucky in that they spend a significant amount of their time optimizing large sites for companies like Facebook, Yahoo!, and Google (and yours truly, Walmart and others). However, most developers have lots of responsibilities besides performance.

当你手头有很多事情要做时,衡量更多的事情就更有意义了。虽然很容易掌握别人为您设计的技术并应用它(您应该这样做),但确保您针对对您的网站影响最大的问题也很重要。几年前,我参加了一次有关 JavaScript 的会议,一位非常杰出、才华横溢且非常聪明的 JavaScript 专家发表了有关性能优化的演讲。他给出了许多深入的技巧,包括展开循环和其他微观优化。

When you have lots of things on your plate, measuring more than pays its way. While it is easy to grab a technique that someone has laid out for you and apply it (and you should), it is also important to make sure you target the issues that affect your site the most. I was at a conference a few years ago about JavaScript and an extremely prominent, talented, and altogether smart JavaScript expert gave a talk about performance optimization. He gave a number of in-depth tips including unrolling loops and other micro-optimizations.

事情是这样的:当您是一个框架的作者,每小时都有数千个网站使用您的代码时,您花在优化代码上的时间就会在每个网站上得到回报。如果您使辅助函数能够反复使用,那么您的工作会通过每次小的使用而获得数倍的回报。然而,当您只关心您维护的一个站点时,展开循环可能不会对您的用户产生重大或明显的影响。优化就是选择正确的目标。

Here is the thing: when you are the author of a framework used by many thousands of sites, every hour you spend optimizing the code pays off on every one of those sites. If you make helper functions to use over and over, your work repays itself many fold through each small usage. However, when you only care about the one site you maintain, unrolling loops probably won’t make a significant or obvious difference to your users. Optimization is all about picking the correct targets.

这是我们再次测量的地方。当你不清楚自己的瓶颈在哪里时,你需要在削减之前进行衡量。衡量性能可以通过多种方式来完成,这一点也很重要。JavaScript 中展开循环是一种非常原子的微优化。它改进了一项特定功能。然而,展开一个仅循环两次且仅被 1% 用户使用的循环显然不是一个重要的时间利用。

This is where we come back to measuring again. When you don’t have a clear understanding of where your bottlenecks are, you need to measure before you cut. Measuring performance can be done in many ways and this is also important to consider. Unrolling loops in JavaScript is a very atomic micro-optimization. It improves one specific function. However, unrolling a loop that loops only twice and is only used by 1% of users is clearly not an important use of time.

测量的关键是仪器仪表。从宏观层面开始。您的网站最重要的部分是什么?这些可能是最常用的,或者对您的业务影响最大的(例如结账流程)。您可能会发现自己感到惊讶,也许您收到大量搜索引擎流量到您网站深处优化不佳的页面。将该页面改进 50% 可能会比花费相同的时间在已经优化的主页上再改进 1% 产生更大的影响。真正了解网站上哪些页面重要的唯一方法是查看统计数据或与网站负责人讨论优先事项。

The key to measurement is instrumentation. Start at a macro level. What are the most important parts of your site? These might be the ones used the most, or the ones that have the most impact on your business (such as the checkout process). You might find yourself surprised: perhaps you receive a lot of search engine traffic to a page deep in your site that is poorly optimized. Improving that page by 50% might make a much bigger impact than spending the same time getting another 1% improvement on your already optimized homepage. The only way to really know which pages on your site are important is to look at the stats or to discuss priorities with whoever is in charge of the site.
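
A back-of-the-envelope calculation makes the comparison tangible. The numbers below are invented for illustration, and `secondsSaved` is a hypothetical helper that simply multiplies views by load time by the fraction of that time removed.

```javascript
// Hypothetical sketch: estimate total user time saved by an optimization.
// All figures are made-up examples, not measurements.
function secondsSaved({ views, loadSeconds, improvement }) {
  return views * loadSeconds * improvement;
}

// A slow page deep in the site that search engines send traffic to.
const deepPageSaving = secondsSaved({ views: 20000, loadSeconds: 8, improvement: 0.5 });

// A popular homepage that is already well optimized.
const homepageSaving = secondsSaved({ views: 100000, loadSeconds: 2, improvement: 0.01 });

console.log({ deepPageSaving, homepageSaving });
```

With these assumed figures, the deep page wins by a wide margin even though the homepage gets five times the traffic, which is why the stats have to come before the optimization work.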

一旦您知道什么是重要的,下一个任务就是弄清楚用户使用这些页面做什么,或者再次了解您希望他们做什么。在此过程中需要注意的是,客户现在所做的事情可能是当前站点的一个属性,而不是您实际上希望他们做的事情。通过查找页面上最常见的任务来确定网站的哪些部分使用最多。用户与哪些页面级项目(菜单、搜索结果)交互最多?

Once you know what’s important, the next task is to figure out what users do with those pages, or again what you want them to do. It’s important to note in this process that what customers do now may be an attribute of the current site and not actually what you want them to do. Identify which parts of your site are used the most by finding the most common tasks on the page. Which page level items (menus, search results) do users interact with most?

这是我们的优化公式:

Here is our formula for optimizing:

  • 步骤 1. 使用检测来选择要优化的页面/部分

  • Step 1. Use instrumentation to pick which pages/sections to optimize

  • 步骤 2. 使用检测来选择要优化的功能

  • Step 2. Use instrumentation to pick which features to optimize

  • 步骤 3. 优化

  • Step 3. Optimize

测量两次,切割一次。

Measure twice, cut once.

识别页面/部分

Identifying Pages/Sections

您如何选择要优化网站的哪些页面或部分?这可能是最简单的任务之一,因为大多数传统指标都会为您提供您需要了解的一切。首先查看哪些页面获得最多的浏览量。这将为您提供一个简短的明显目标列表。您的主页几乎肯定是其中之一,然后是您网站上的其他热门页面。这些应该是您的最终清单。

How do you go about picking which pages or sections of your site to optimize? This is probably one of the easiest tasks because most conventional metrics give you everything you need to know. Start by seeing which pages get the most views. This will give you a short list of obvious targets. Your homepage is almost certainly one of them, followed by other popular pages on your site. These should be your short list.

接下来要做的就是与您的企业主交谈。那可能是你的项目经理、首席执行官,无论是谁。最受欢迎的页面并不总是对企业最重要的。结帐和购物车是非常明显的例子。如果您经营一个电子商务网站,很多人会浏览很多商品,但只有一小部分人会结帐。这并不意味着退房不重要。相反。结帐确实很重要,只是指标可能无法帮助您确定优先级。

The next thing to do is talk to your business owner. That might be your project manager, CEO, whoever. The most popular pages are not always the most important to the business. Checkout and shopping cart are very obvious examples here. If you run an e-commerce site many many people will browse many items, but only a small percentage of people will check out. This doesn’t mean check-out isn’t important. On the contrary. Checkout is really important, it’s just something that metrics may not help you prioritize.

现在,您应该拥有网站的页面或部分的列表,其中包含对企业最流行或最重要的页面或部分。这是你的命中列表。定期保持最新状态。在用完命中列表之前,不要担心其他性能问题。

Now you should have a list of the pages or sections of your site that are a mix of the most popular or important ones to the business. This is your hit list. Keep it up-to-date periodically. Until you’ve exhausted your hit list don’t bother with other performance issues.
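
The hit list described above can be kept as a small, repeatable computation. This is a minimal sketch: `buildHitList`, the page paths, and the view counts are hypothetical, and the view counts would come from whatever analytics you already have.

```javascript
// Hypothetical sketch: combine the most viewed pages with the pages the
// business owner says matter, without listing any page twice.
function buildHitList(pageViews, businessCritical, topN = 3) {
  const popular = Object.entries(pageViews)
    .sort(([, a], [, b]) => b - a) // most viewed first
    .slice(0, topN)
    .map(([path]) => path);

  return [...new Set([...popular, ...businessCritical])];
}

const hitList = buildHitList(
  { '/': 9000, '/products/42': 4000, '/about': 300, '/checkout': 120 },
  ['/checkout'],
  2
);
```

Here the checkout page makes the list despite its low view count, because the business says it matters. Rerunning the function with fresh stats is one way to keep the list up to date.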

识别特征

Identifying Features

在现代网站上,许多页面在许多页面上共享相同的代码。查看代码以找到这些功能,或在命中列表页面上使用数据包嗅探器,例如 WiresharkCharles Proxy或 Chrome Inspector。这将帮助您获取大多数页面使用的外部资源(CSS、脚本、图像等)的列表。您还可以检查 HTTP 日志,以了解这些热门页面正在请求哪些数据资源(Web 服务)。这些资源也可能成为页面渲染的阻碍因素。

On modern websites, many pages share the same code. Look at the code to find these shared features, or use a packet sniffer like Wireshark, Charles Proxy, or the Chrome Inspector on your hit list pages. This will help you get a list of the external resources (CSS, scripts, images, etc.) that are used by the most pages. You can also examine your HTTP logs to look at what data resources (web services) are being requested for those popular pages. Those resources could also be a blocking factor in page rendering.

您还应该尝试确定用户在每个页面上执行的操作。这可能很困难。除非您有非常丰富的指标系统,否则您可能不知道用户的光标在哪里,或者滚动了多少。然而,您可能可以做的是查看他们通常从您的历史列表页面点击的位置。这将使您了解最常用的内容。例如,在产品描述页面上,它可能是“添加到购物车”按钮。您还应该考虑时间安排,一般来说,导航菜单项之类的东西在渲染后会比“添加到购物车”按钮更快地被点击。这是因为人们买东西时通常会先阅读产品说明。当他们进行导航时,他们还没有阅读页面内容。 回旋镖

You should also try to identify what your users are doing on each page. This can be difficult. Unless you have a very rich metrics system, you probably don’t know where the users’ cursors are, or how much they scroll. What you can probably do, however, is look at where they commonly click to from your hit list pages. This will give you an idea of what is being used the most. For example, on a product description page it might be the “Add to Cart” button. You should also look at timing: things like navigation menu items are generally going to get clicked a lot sooner after rendering than an “Add to Cart” button. This is because when people buy things, they normally read the product description first. When they are navigating, they aren’t reading page content yet. You can instrument your pages with JavaScript or, if you want to be a clever-clogs, compute the time between page loads per user using a project like Boomerang.

一般来说,目标是找出用户最容易需要哪些东西。作为非正式的经验法则,请考虑按以下顺序优先加载项目:

In general the goal is to figure out which things the user will need most readily. As an informal rule of thumb consider prioritizing items to load in this order:

  • 首屏项目

  • Items above the fold

  • 导航项(菜单、搜索栏)

  • Navigation items (menus, search bar)

  • 提供信息的项目(产品描述、新闻报道)

  • Items that provide information (Product description, News stories)

  • 要采取行动的项目(添加到购物车等)

  • Items to take an action (Add to cart, etc)

  • 首屏以下的项目

  • Items below the fold

您可以使用WebPageTest的幻灯片功能检查网站上各种内容的加载速度 。

You can check how fast various things load on your site by using WebPageTest's film strip feature.

优化

Optimizing

最后一步当然是优化。请记住,即使在优化某个功能时,也不要将所有时间都花在优化已经优化的内容上,因为有 90% 的使用量与未优化的内容相同。这就是指标的意义所在,可以做出正确的决策。这既适用于您的页面和功能列表,也适用于代码。优化的目标应该是进行测量,然后充分利用您的时间来影响用户体验。查看页面渲染和 JavaScript 分析器和技术。那里有很多资源,一旦您知道需要优化什么,就去找一些东西来解决您的问题,然后测量,再次测量。

The final step is, of course, optimizing. Remember, even when optimizing a feature, don’t spend all your time optimizing something that is already optimized when there is something used 90% as much that isn’t. That’s the point of metrics: to make good decisions. This goes both for your list of pages and features, and within the code. The goal of optimizing should be to take your measurements and then make the best use of your time to affect the users’ experience. Check out page rendering and JavaScript profilers and techniques. There are lots of resources out there. Once you know what you need to optimize, go and find something to solve your problem, and then measure, and measure again.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/measure-twice-cut-once/。最初发布于 2011 年 12 月 31 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/measure-twice-cut-once/. Originally published on Dec 31, 2011.

第 32 章当好的后端变坏时

Chapter 32. When Good Backends Go Bad

帕特里克 ·米南

Patrick Meenan

有大量研究 ( http://www.yuiblog.com/blog/2006/11/28/performance-research-part-1/ ) 告诉我们 80% 到 90% 的时间花在加载上网页花费在前端(浏览器拉入 CSS、JavaScript 和图像等外部资源),而典型页面只有 10% 到 20% 的时间花费在后端。虽然一般来说确实如此,并且有很多工具专注于为您提供有关改进前端代码的建议WebPagetestPage SpeedY-Slow,但看到后端性能问题并不罕见,特别是当您远离将顶级互联网网站纳入互联网长尾。

There has been a fair amount of research (http://www.yuiblog.com/blog/2006/11/28/performance-research-part-1/) that tells us that 80% to 90% of the time spent loading web pages is spent on the frontend (browser pulling in external resources like CSS, JavaScript, and images) and only 10% to 20% of the time for a typical page is spent on the backend. That is true in general, and there are a lot of tools that focus on giving you suggestions for improving your frontend code (WebPagetest, Page Speed, YSlow). However, it is not uncommon to see backend performance issues, particularly as you move away from the top Internet sites into the long tail of the Internet.

这并不完全出乎意料,因为顶级网站往往有专门的开发人员,他们定制用于服务页面的后端代码,有专门的运营团队来监视系统和数据库的性能,并花费大量时间专注于性能和数据库的性能。后端的可扩展性。

This is not entirely unexpected because the top sites tend to have dedicated developers who custom-built the backend code for serving pages, have dedicated operations teams that watch the performance of the systems and databases, and spend a lot of time focused on the performance and scalability of the backends.

当您脱离顶级互联网出版商时,您开始遇到在现成内容系统(Drupal、WordPress、Joomla 等)上运行的网站,以及与以下网站签订了网站开发合同的所有者:在某个时间点或使用并调整了可用的模板,然后使用一组插件来组合他们的网站(通常不知道插件本身如何工作)。这些网站的托管也有很大差异,从专用服务器到 VPS 系统,再到在共享托管上运行(迄今为止最常见),他们对网站运行的实际系统的性能几乎没有了解。

As you move out of the top tier of Internet publishers, you start running into sites that are running on off-the-shelf content systems (Drupal, WordPress, Joomla, etc.), and with owners who either contracted for the site development at one point in time or used and tweaked an available template and then used a collection of plug-ins to put together their site (often not knowing how the plug-ins themselves work). The hosting for these sites also varies wildly from dedicated servers to VPS systems to running on shared hosting (by far the most common) where they have little to no insight on the performance of the actual systems their site is running on.

因此,如图 32-1所示的情况并不罕见。

As a result, it’s not uncommon to see something like what is shown in Figure 32-1.

图 32-1。30 秒 TTFB

Figure 32-1. 30-second TTFB

是的,第一个字节 (TTFB) 需要 30 多秒的时间,所有时间都花在后端的某个地方来组装和生成页面。这也不是异常值。对于此页面,每个 页面加载都需要 30 秒以上,然后浏览器才能处理第一个 HTML 部分。

Yes, that is a 30+ second time to first byte (TTFB) with all of the time being spent somewhere on the backend to assemble and produce the page. This wasn’t an outlier either. For this page, every page load takes 30+ seconds before the browser even gets the first bit of HTML to work on.

这并不是本网站或其运行的内容管理系统 (CMS) 所独有的(尽管这是一个极端的例子)。几乎所有不同的 CMS 系统的后端时间为 8 到 20 秒的情况并不少见(图 32-2)。

This isn’t unique to this site or the Content Management System (CMS) it runs on (though it is an extreme example). It is not uncommon to see 8-to-20-second backend times from virtually all of the different CMS systems (Figure 32-2).

图 32-2。12 秒 TTFB

Figure 32-2. 12-second TTFB

这对于用户来说确实很痛苦(假设他们中的任何一个人实际上等待该站点那么长时间),但它也会导致后端的扩展问题,因为应用程序需要花费很长时间来处理每个请求,从而导致其他用户可用的资源更少。

This is really painful for users (assuming any of them actually wait that long for the site), but it also causes scaling problems for the backend because the application is tied up for a long time processing each request, making fewer resources available for other users.

什么是好的后端时间?

What Is a Good Backend Time?

后端请求处理时间的一个很好的目标是 100 毫秒(0.1 秒)左右。这并不意味着您应该期望 TTFB 为 100 毫秒,只是后端处理时间不应比这更长。重要的是要记住,用户 在 TTFB 之前根本看不到任何内容,因此任何改进都会直接影响用户体验。

A good target for just the processing time for backend requests is on the order of 100ms (0.1 seconds). That doesn’t mean you should expect a TTFB of 100ms, just that the backend processing time shouldn’t take longer than that. It is important to remember that the user can’t see anything at all before the TTFB, so any improvements there go directly to the user experience.

当通过 WebPagetest 等前端工具计算后端时间时,您需要记住包括网络延迟。为此,我通常使用服务器的套接字连接时间(橙色条)作为 RTT,然后将其用作其他所有内容的基线(图 32-3)。

When figuring out the backend time from a frontend tool like WebPagetest, you need to remember to account for the network latency. For that, I usually use the socket connect time to the server (orange bar) as the RTT and then use that as a baseline for everything else (Figure 32-3).

图 32-3。1.5 秒 TTFB

Figure 32-3. 1.5-second TTFB

在本例中,DNS 查找时间(青色条)花费的时间比我预期的要长,但您想要将橙色条的大小与浅绿色条的大小进行比较。橙色条的长度是服务器能够回复的最快速度,并假设后端处理时间为 0,因此如果它们的大小相当接近,那么您的状态就非常好。

In this case, the DNS lookup time (teal bar) is taking longer than I would expect but you want to compare the size of the orange bar to the size of the light green bar. The length of the orange bar is the fastest the server would be able to reply and assumes 0 backend processing time, so if they are reasonably close in size then you’re in pretty good shape.

目视瀑布有助于获得总体感觉,但如果您想了解具体细节,您可以在 WebPagetest 上瀑布下方的数据表中获取各个组件时间(图 32-4)。

Eyeballing waterfalls is good for a general feeling but if you want to see the specifics, you can get the individual component times in a data table below the waterfalls on WebPagetest (Figure 32-4).

图 32-4。请求时间详细信息

Figure 32-4. Request timing details

在这种情况下,您只需从 TTFB 中减去初始连接时间,就可以得到后端花费的时间(此处为 436 毫秒)。

In this case, you just subtract the initial connection time from the TTFB and you have the amount of time that was spent on the backend (436ms here).
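The subtraction can be sketched in a few lines of JavaScript. The function name and the sample numbers below are illustrative assumptions, not values read from the figure; substitute the initial connection and TTFB values from your own request details table.

```javascript
// Estimate backend processing time from two waterfall measurements.
// Both inputs are in milliseconds; the function name is hypothetical.
function estimateBackendMs(ttfbMs, connectMs) {
  if (!Number.isFinite(ttfbMs) || !Number.isFinite(connectMs)) {
    throw new TypeError("ttfbMs and connectMs must be finite numbers");
  }
  // The connect time approximates one network round trip (RTT), which is the
  // fastest possible reply if the backend did no work at all.
  // Clamp at zero so measurement noise never yields a negative time.
  return Math.max(0, ttfbMs - connectMs);
}

// Made-up sample: a 600 ms TTFB with a 100 ms connect time.
const backendMs = estimateBackendMs(600, 100);
console.log(backendMs + " ms spent on the backend"); // 500 ms spent on the backend

const TARGET_MS = 100; // the roughly 100 ms target suggested earlier in the chapter
console.log(backendMs <= TARGET_MS ? "within target" : "over target"); // over target
```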

弄清楚发生了什么

Figuring Out What Is Going On

那么,您知道自己遇到了后端问题,那么如何找出导致问题的原因呢?

So you know you have a backend issue. How do you figure out what is causing the problem?

该问题几乎肯定是由以下问题之一引起的:

The problem is almost certainly caused by one of these issues:

  • 没有可用客户端来处理请求的 Web 服务器配置

  • Web server configuration that is out of available clients to process requests

  • 数据库查询速度慢

  • Slow database queries

  • 后端调用外部服务

  • Backend calls to external services

不幸的是,您习惯使用的大多数性能工具对这些组件没有任何可见性,它们变成了黑匣子。此时,您需要一名开发人员和一名系统管理员(或具有同时完成这两项工作的技能的人员),因为修复该问题将涉及代码或站点配置更改。即使只是找到问题的根源也需要相当不错的技能。

Unfortunately, most of the performance tools you are used to using don’t have any visibility into those components and they become a black box. At this point, you need a developer and a sysadmin (or someone with the skillset to do both) because fixing it is going to involve code or site configuration changes. Even just finding the source of the problem requires a pretty decent skillset.

有一些商业解决方案可以通过最少的工作快速地为您识别问题。实际上,有一个完整的行业领域专注于它(称为应用程序性能管理或 APM)。我将在这里使用 New Relic ( http://newrelic.com/ ) 作为示例,因为它是我在 webpagetest.org 上使用的,但 Dynatrace ( http://www.dynatrace.com/ ) 是另一个常见的解决方案。不过,所有这些都要求您在服务器上安装二进制代码,因此,如果您使用共享托管,这些可能不可用(而且一旦您完成免费试用阶段,大多数费用都比共享托管计划高)。

There are commercial solutions that will identify the issue for you really quickly with minimal work. Actually, there is a whole sector focused on it (called Application Performance Management or APM). I’ll use New Relic (http://newrelic.com/) as an example here because it is what I use on webpagetest.org but Dynatrace (http://www.dynatrace.com/) is another common solution. All of them require that you install binary code on the server though, so if you are on shared hosting these may not be available options (and once you get through the free trial phase most cost more than shared hosting plans anyway).

配置完成后,APM 工具将监控您的生产系统并告诉您服务器在各个不同层花费了多少时间(图 32-5)。

Once configured, the APM tools will monitor your production systems and tell you how much time your server is spending in the various different tiers (Figure 32-5).

图 32-5。New Relic 概要

Figure 32-5. New Relic summary

我已经对 WebPagetest 进行了相当多的调整,因此这里没有太多可看的内容。平均响应时间约为 10 毫秒,数据库仅用于论坛,因此大部分时间都花在实际的应用程序代码上。

I’ve done a fair bit of tuning to WebPagetest, so there’s not a whole lot to see here. Average response times are ~10ms and the database is only used for the forums so the bulk of the time is spent in the actual application code.

从那里您可以深入了解每个频段以准确查看时间的去向(图 32-6)。

From there you can drill into each band to see exactly where that time is going (Figure 32-6).

图 32-6。New Relic 事务

Figure 32-6. New Relic transactions

就我而言,大部分 CPU 时间都花在生成结果页面的缩略图(包括瀑布缩略图)上。并不完全出乎意料,因为它们都是由代码动态生成的。

In my case, most of the CPU time is spent generating thumbnail images (which includes waterfall thumbnails) for the results pages. Not completely unexpected since they are all generated dynamically by code.

我花了相当多的时间来优化缩略图生成,因为它过去需要消耗大量资源,并且占用了接近 80% 的时间。这些工具可让您继续深入了解哪些特定功能对时间有贡献(图 32-7)。

The thumbnail generation is something I spent a fair amount of time optimizing because it used to be a lot more resource intensive and took close to 80% of the time. The tools let you keep drilling in to see what specific functions contribute to the time (Figure 32-7).

图 32-7。New Relic 缩略图详细信息

Figure 32-7. New Relic thumbnail details

它们允许您对数据库调用执行相同的操作,对于特别慢的请求,它们将为单个请求提供诊断,而不仅仅是聚合结果,因此您还可以轻松深入了解慢速异常值。

They let you do the same for database calls, and for particularly slow requests, they will provide diagnostics for individual requests instead of just aggregate results so you can also drill into slow outliers easily.

如果您没有足够的运气能够使用这些工具,那么您必须查看可用于您的平台的工具,看看是否有免费的诊断工具,或者您必须开始自己检测代码。例如,在 WordPress 中,有几个插件可以调试数据库查询并告诉您它们花费了多长时间。

If you aren’t fortunate enough to be able to use the tools, then you have to look into what is available for your platform to see if there are free diagnostic tools or you have to start instrumenting the code yourself. In WordPress, for example, there are several plug-ins that will debug the database queries and tell you how long they are taking.
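If you do end up instrumenting the code yourself, the basic idea is to wrap suspect operations in a timer and log the slowest ones. The following is a minimal sketch under that assumption; `createTimer`, `timed`, and `slowest` are hypothetical names, and it handles synchronous calls only.

```javascript
// Minimal hand-rolled instrumentation: time labeled operations and
// report the slowest ones. The clock is injectable to ease testing.
function createTimer(now = () => Date.now()) {
  const records = []; // { label, ms } for every timed call

  // Run fn, record how long it took, and pass its return value through.
  function timed(label, fn) {
    const start = now();
    try {
      return fn();
    } finally {
      records.push({ label, ms: now() - start });
    }
  }

  // Return the n slowest recorded operations, slowest first.
  function slowest(n = 5) {
    return [...records].sort((a, b) => b.ms - a.ms).slice(0, n);
  }

  return { timed, slowest };
}

// Usage with stand-ins for real work such as database queries.
const timer = createTimer();
const doubled = timer.timed("double numbers", () => [1, 2, 3].map((n) => n * 2));
timer.timed("sum numbers", () => doubled.reduce((a, b) => a + b, 0));
console.log(timer.slowest(2));
```

Wrapping the database layer and any calls to external services in this way is usually enough to tell which of the three causes listed earlier is responsible.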

W3 Total Cache 是一个用于提高 WordPress 性能的有用插件,但它还提供调试信息,帮助您识别任何缓慢的数据库调用(图 32-8)。

W3 Total Cache is a useful plug-in for improving WordPress performance but it also provides debugging information that will help you identify any slow database calls (Figure 32-8).

图 32-8。W3 Total Cache 调试设置

Figure 32-8. W3 Total Cache debug settings

当您启用调试信息时,有关每个数据库查询(和缓存操作)的详细信息将作为注释记录到页面 HTML 中,您可以通过访问该页面并查看页面源代码来查看该注释(图 32-9)。

When you enable the debug information, details about every database query (and cache operation) will be logged into the page HTML as a comment that you can view by visiting the page and viewing the page source (Figure 32-9).

图 32-9。W3 Total Cache 调试数据

Figure 32-9. W3 Total Cache debug data

您将获得数据库查询所花费的总时间以及每个查询的时间和详细信息。

You’ll get the overall time spent in database queries as well as timings and details for each and every query.

修复它

Fixing It

太好了,既然您已经确定了问题,那么真正的艰苦工作就开始了。人们使用的最常见的“解决方案”是添加缓存来隐藏问题。这可以采用 W3 Total Cache 等插件的形式,让您可以使用 memcache 将各种不同的操作缓存到自定义查询缓存中。缓存是绝对必要的,但您应该在启用缓存之前尽可能地改善底层问题,这样 100% 的请求都会获得改进的性能。

Great, so now that you’ve identified the issues, the real hard work starts. The most common “solution” people use is to add caching to hide the problem. This can be in the form of a plug-in like W3 Total Cache that will let you cache all sorts of different operations to custom query caches by using memcache. Caches are absolutely necessary, but you should improve the underlying issue as much as possible before enabling caching; that way, 100% of the requests will get improved performance.
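To see why caching alone only hides the problem, consider a minimal query cache with a time-to-live. This is a sketch with hypothetical names (`createQueryCache`, `runQuery`), not how W3 Total Cache or memcache is implemented: every miss, including the first request after an entry expires, still pays the full cost of the underlying query.

```javascript
// Minimal in-process query cache with a time-to-live (TTL).
// Hypothetical helper; a real site would more likely use memcache or a plug-in.
function createQueryCache(runQuery, ttlMs, now = () => Date.now()) {
  const entries = new Map(); // key -> { value, expires }
  const stats = { hits: 0, misses: 0 };

  function get(key) {
    const entry = entries.get(key);
    if (entry && entry.expires > now()) {
      stats.hits += 1;
      return entry.value;
    }
    // Cache miss: this request still pays the full cost of the query,
    // which is why the underlying query should be fast to begin with.
    stats.misses += 1;
    const value = runQuery(key);
    entries.set(key, { value, expires: now() + ttlMs });
    return value;
  }

  return { get, stats };
}

// Usage with a stand-in for a slow database call.
const cache = createQueryCache((key) => "result for " + key, 60 * 1000);
cache.get("recent-posts"); // miss: runs the query
cache.get("recent-posts"); // hit: served from the cache
console.log(cache.stats); // { hits: 1, misses: 1 }
```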

最后

Finally

正如木工行业所说,测量两次,切割一次。在测量用户体验之前不要优化您的网站,然后使用测量结果来指导您的工作,而不是使用各种工具的成绩或分数 - 它们可能与您的特定情况无关。仅仅因为网站通常将大部分时间花在前端并不意味着您的网站也一定是这种情况。

As they say in carpentry, measure twice, cut once. Don’t go optimizing your site until you have measured the user experience and then use the measurements to guide your work, not grades or scores from various tools—they may not be relevant to your particular situation. Just because sites normally spend most of their time on the frontend doesn’t mean that is necessarily the case for yours.

笔记

Note

要对本章发表评论,请访问http://calendar.perfplanet.com/2011/when-good-back-ends-go-bad/。最初发布于 2011 年 12 月 31 日。

To comment on this chapter, please visit http://calendar.perfplanet.com/2011/when-good-back-ends-go-bad/. Originally published on Dec 31, 2011.

第 33 章 Web 字体性能:权衡 @font-face 选项和替代方案

Chapter 33. Web Font Performance: Weighing @font-face Options and Alternatives

戴夫 ·阿茨

Dave Artz

网络字体是当今网站设计的关键要素;在我的雇主(AOL),重新设计将采用可下载字体是理所当然的。维护充满图形文本标题的精灵的日子已经过去了。我们已经继续前进,但是什么方法可以产生最佳性能呢?

Web fonts are a key ingredient in today’s website designs; at my employer (AOL) it is a given that redesigns will feature downloadable fonts. The days of maintaining a sprite full of graphic text headlines are behind us. We’ve moved on—but what approach yields the best performance?

本章的目标是了解各种可用的网络字体实现选项,对其性能进行基准测试,并为您提供一些有用的技巧,以最大限度地发挥字体字节的作用。我什至会添加一个新的字体加载器作为特别奖励!

The goal of this chapter is to look at the various web font implementation options available, benchmark their performance, and arm you with some useful tips in squeezing the most bang for your font byte. I will even throw in a new font loader as a special bonus!

字体托管服务与自己动手

Font Hosting Services Versus Rolling Your Own

您可以采用两种方法在网页上获取许可的可下载字体:字体托管服务和自己动手 (DIY)。

There are two approaches you can take to get licensed, downloadable fonts on to your web pages: font hosting services and do-it-yourself (DIY).

字体托管服务
Font hosting services

Typekit、Fonts.com、Fontdeck 等为设计人员提供了一个简单的界面来管理购买的字体,并生成指向提供该字体的动态 CSS 或 JavaScript 文件的链接。谷歌甚至免费提供这项服务。Typekit 是唯一提供额外字体提示的服务,以确保字体在浏览器中占据相同的像素。

Typekit, Fonts.com, Fontdeck, etc., provide an easy interface for designers to manage fonts purchased, and generate a link to a dynamic CSS or JavaScript file that serves up the font. Google even provides this service for free. Typekit is the only service to provide additional font hinting to ensure fonts occupy the same pixels across browsers.

DIY方法
The DIY approach

这涉及购买获得网络使用许可的字体,以及(可选)使用 FontSquirrel 生成器等工具来优化其文件大小。然后,使用标准@font-face CSS的跨浏览器实现(http://www.fontspring.com/blog/the-new-bulletproof-font-face-syntax/)来启用字体。这种方法最终提供了最佳性能。

This involves purchasing a font licensed for web use, and (optionally) using a tool like FontSquirrel’s generator to optimize its file size. Then, a cross-browser implementation (http://www.fontspring.com/blog/the-new-bulletproof-font-face-syntax/) of the standard @font-face CSS is used to enable the font(s). This approach ultimately provides the best performance.

这两种方法都使用标准的 @font-face CSS3 声明,即使是通过 JavaScript 注入也是如此。像 Google 和 Typekit 使用的 JS 字体加载器(即 WebFont 加载器 ( https://developers.google.com/webfonts/docs/webfont_loader ))提供 CSS 类和回调来帮助管理可能发生的“FOUT”,或下载字体时响应超时。

Both approaches make use of the standard @font-face CSS3 declaration, even when injected via JavaScript. JS font loaders like the one used by Google and Typekit (i.e., WebFont loader (https://developers.google.com/webfonts/docs/webfont_loader)) provide CSS classes and callbacks to help manage the “FOUT” that may occur, or response timeouts when downloading the font.

什么是FOUT?

What the FOUT?

FOUT 即“Flash of Unstyled Text”(无样式文本闪烁),由 Paul Irish 命名 ( http://paulirish.com/2009/fighting-the-font-face-fout/ ),指在网络字体下载并渲染之前短暂显示备用字体的现象。这可能会带来不和谐的用户体验,尤其是在字体样式明显不同的情况下。

FOUT, or “Flash of Unstyled Text,” was coined by Paul Irish (http://paulirish.com/2009/fighting-the-font-face-fout/) and is the brief display of the fallback font before the web font is downloaded and rendered. This can be a jarring user experience, especially if the font style is significantly different.

某种形式的 FOUT 存在于所有版本的 Internet Explorer 和 Firefox 3.6 及更低版本中。您可以观看我的演示视频 ( http://www.artzstudio.com/files/font-performance/fout-demo.html ),最好是在全屏模式下,在 1.6 秒标记处查看它的实际效果。图33-1为1.6s时的视频截图。

FOUT of some form exists in all versions of Internet Explorer and Firefox 3.6 and lower. You can check out the video of my demo (http://www.artzstudio.com/files/font-performance/fout-demo.html), preferably in full screen mode, at the 1.6 second mark to see it in action. Figure 33-1 shows a screenshot of the video at 1.6s.

图 33-1。FOUT

Figure 33-1. FOUT

您会注意到,在 Internet Explorer 9 中,在图像下载 ( http://www.webpagetest.org/video/compare.php?tests=120108_PQ_2SH9D-r:1-c:0 )之前,内容会被阻止。你的猜测和我的一样好。

You’ll notice in Internet Explorer 9, the content is blocked until the image has downloaded (http://www.webpagetest.org/video/compare.php?tests=120108_PQ_2SH9D-r:1-c:0). Your guess is as good as mine.

以下是我对避免 FOUT 的建议:

Here are my recommendations for avoiding the FOUT:

还有 FOUT 吗?继续阅读,JavaScript 字体加载器可能是合适的。

Still have a FOUT? Read on, a JavaScript font loader may be in order.

删除多余的字体字形

Removing Excess Font Glyphs

Font Squirrel 有一个很棒的工具 ( http://www.fontsquirrel.com/fontface/generator ),它可以让您获取桌面字体文件并生成其 Web 对应版本。它还允许您获取字体的子集,从而显着减小文件大小。

Font Squirrel has an awesome tool (http://www.fontsquirrel.com/fontface/generator) that lets you take a desktop font file and generate its web counterparts. It also allows you to take a subset of the font, significantly reducing file size.

为了显示其重要性,我添加了 Open Sans 并尝试了所有三种设置(图 33-2)。

To show just how significant, I added Open Sans and tried all three settings (Figure 33-2).

图 33-2。消除多余的字形

Figure 33-2. Excess glyphs elimination

从图 33-2的表中可以明显看出,字节大小与字体文件中的字形(字符)数量直接相关。

From the table in Figure 33-2, it should be obvious that the byte size is directly correlated to the number of glyphs (characters) in the font file.

我建议你在Fontsquirrel上跟随我!

I suggest you follow along with me at Font Squirrel!

基本(Basic)设置保持字符不变。最佳(Optimal)设置将字符减少到大约 256 个(Mac Roman 字符集)。通过选择专家(Expert)模式并仅包含基本拉丁语集,然后手动添加我们需要的字符,我们可以看到最大的节省。

The Basic setting leaves the characters untouched. Optimal reduces the characters to around 256, the Mac Roman character set. We are able to see the greatest savings by selecting Expert mode and only including the Basic Latin set, then manually adding in the characters we need.

以下是我推荐的专家 FontSquirrel 设置(截图: http ://www.artzstudio.com/files/font-performance/fontsquirrel-generator-settings.png ):

Here are my recommended Expert FontSquirrel settings (screenshot: http://www.artzstudio.com/files/font-performance/fontsquirrel-generator-settings.png):

  • 在“渲染”下,取消选中“修复垂直指标”。

  • Under Rendering, uncheck Fix Vertical Metrics.

  • 在子集设置下,选中自定义子集设置。

  • Under Subsetting, check Custom Subsetting.

  • 在 Unicode 表下,选中基本拉丁语。

    笔记

    这假设字体仅使用英文字符;对于其他语言,请添加您需要的字符。

  • Under Unicode Tables, check only Basic Latin.

    Note

    This assumes the fonts will use only English characters; for other languages, add the characters you need.

  • 如果您是排版迷,请将 ‘ ’ “ ” 复制并粘贴到“单个字符”字段中。

  • If you are a typography nerd, copy and paste ‘ ’ “ ” into the Single Characters field.

  • 验证您的子集预览;如果需要的话进行调整(图33-3)。

  • Verify your Subset Preview; adjust if needed (Figure 33-3).

  • 在“高级选项”下,根据子集为您的字体指定后缀(例如 latin)。

  • Under Advanced Options, give your font a suffix based on the subset (e.g., latin).

图 33-3。子集预览

Figure 33-3. Subset preview
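Before settling on a subset, it helps to know which characters your content actually uses beyond Basic Latin (U+0000 to U+007F), since those are the ones to add to the Single Characters field. The helper below is a hypothetical sketch for checking a sample of your text; it is not part of the Font Squirrel tool.

```javascript
// List the distinct characters in `text` that fall outside Basic Latin
// (U+0000 to U+007F), so they can be added to the subset by hand.
function charsOutsideBasicLatin(text) {
  const extras = new Set();
  for (const ch of text) { // iterates by code point, not by UTF-16 unit
    if (ch.codePointAt(0) > 0x7f) {
      extras.add(ch);
    }
  }
  return [...extras].sort();
}

// Sample headline containing curly quotes and a curly apostrophe.
const headline = "\u201cFast\u201d fonts don\u2019t need every glyph";
console.log(charsOutsideBasicLatin(headline).join(" ")); // ’ “ ”
```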

JavaScript 字体加载器

JavaScript Font Loaders

Typekit 和 Google 联手创建了一个开源 WebFont Loader ( https://developers.google.com/webfonts/docs/webfont_loader ),它提供 CSS 和 JavaScript 挂钩,指示字体下载时的状态。这对于通过隐藏文本并调整 CSS 属性来标准化跨浏览器的 FOUT ( http://24ways.org/2010/using-the-webfont-loader-to-make-browsers-behave-the-same )很有用。两种字体占据相同的宽度。

Typekit and Google joined forces to create an open source WebFont Loader (https://developers.google.com/webfonts/docs/webfont_loader) that provides CSS and JavaScript hooks indicating a font’s status as it downloads. This can be useful in normalizing the FOUT across browsers (http://24ways.org/2010/using-the-webfont-loader-to-make-browsers-behave-the-same) by hiding the text and adjusting CSS properties so that both fonts occupy the same width.

它跟踪的三种状态是加载、活动和非活动(超时)。相应的 CSS 类(wf-loading、wf-active 和 wf-inactive)可用于控制 FOUT,方法是首先隐藏标题,然后在字体下载后显示它们:

The three states it tracks are loading, active, and inactive (timeout). Corresponding CSS classes (wf-loading, wf-active, and wf-inactive) can be used to control the FOUT by first hiding headings and then showing them once they’re downloaded:

h1 {
    visibility: hidden;
}
.wf-active h1 {
    visibility: visible;
}

这些相同事件的 JavaScript 挂钩也可以通过配置对象中的回调获得:

JavaScript hooks for these same events are also available via callbacks in the configuration object:

WebFontConfig = {
    google: {
        families: [ 'Tangerine', 'Cantarell' ] // Google example
    },
    typekit: {
        id: 'myKitId' // Typekit example
    },
    loading: function() {
        // JavaScript to execute when fonts start loading
    },
    active: function() {
        // JavaScript to execute when fonts become active
    },
    inactive: function() {
        // JavaScript to execute when fonts become inactive (time out)
    }
};

WebFont 加载器还包括 fontloading、fontactive 和 fontinactive 回调,每次字体状态更新时都会触发,让您可以在字体级别进行控制。有关更多信息,请查看 WebFont 加载器文档 ( https://developers.google.com/webfonts/docs/webfont_loader )。

The WebFont loader also includes callbacks for fontactive, fontloading, and fontinactive that are fired each time a font updates, giving you control at a font level. For more information, check out the WebFont Loader documentation (https://developers.google.com/webfonts/docs/webfont_loader).
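To make the connection between the three states and the CSS rules concrete, here is a small sketch of how a loader could swap a status class as a font moves between states. It is an assumption-laden illustration, not the WebFont Loader's actual code: `setFontState` is a hypothetical name, and the class list is modeled as a plain array rather than a DOM element.

```javascript
// Swap the font-status class on a class list so that exactly one of
// wf-loading, wf-active, or wf-inactive is present afterwards.
const WF_STATES = ["loading", "active", "inactive"];

function setFontState(classNames, state) {
  if (!WF_STATES.includes(state)) {
    throw new RangeError("unknown font state: " + state);
  }
  const statusClasses = WF_STATES.map((s) => "wf-" + s);
  // Keep unrelated classes and drop any previous status class.
  const kept = classNames.filter((name) => !statusClasses.includes(name));
  return kept.concat("wf-" + state);
}

let classes = ["no-js"];
classes = setFontState(classes, "loading"); // ["no-js", "wf-loading"]
classes = setFontState(classes, "active"); // ["no-js", "wf-active"]
console.log(classes.join(" ")); // no-js wf-active
```

Once the active class is present on an ancestor element, a rule such as `.wf-active h1` starts to match and the hidden headings become visible.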

Boot.getFont 简介:快速、小型的 Web 字体加载器

Introducing Boot.getFont: A Fast and Tiny Web Font Loader

我还没有见过这样的加载器,所以我编写了一个小型字体加载器 getFont,它提供相同的字体加载钩子,是我的 Boot 库(https://github.com/artzstudio/Boot)的一部分。

I haven’t seen one out there, so I wrote a little font loader called getFont that provides the same hooks for loading fonts; it is part of my Boot library (https://github.com/artzstudio/Boot).

GZIP 后的大小为 1.4 KB(而 Google 为 6.4 KB,Typekit 为 8.3 KB),可以轻松放入您现有的库中。只需更改文件末尾的 "Boot" 字符串即可更新命名空间(例如 jQuery)。

It weighs in at 1.4 KB after GZIP (versus 6.4 KB for Google and 8.3 KB for Typekit) and easily fits into your existing library. Simply change the "Boot" string at the end of the file to update the namespace (e.g., jQuery).

字体通过 JavaScript 函数加载,并且可以提供在字体完成渲染后执行的回调。

Fonts are loaded via a JavaScript function, and a callback can be supplied that executes after the font has finished rendering.

Boot.getFont("opensans", function(){
    // JavaScript to execute when font is active.
});

Boot.getFont提供与 WebFont Loader 类似的 CSS 类,但在字体级别,提供精确的控制:

Boot.getFont provides similar CSS classes to the WebFont Loader but at a font level, affording precise control:

.wf-opensans-loading {
    /* Styles to apply while font is loading. */
}
.wf-opensans-active {
    /* Styles to apply when font is active. */
}
.wf-opensans-inactive {
    /* Styles to apply if font times out. */
}

您可以通过加载配置对象轻松将其配置为根据目录结构获取字体:

You can easily configure it to grab fonts based on your directory structure by loading a configuration object:

// Global
Boot.getFont.option({
    path: "/fonts/{f}/{f}-webfont" // {f} is replaced with the font name
});

// Font-specific
Boot.getFont({ path: "http://mycdn.com/fonts/{f}/{f}-wf" }, "futura" );
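The `{f}` placeholder is a simple template substitution. The sketch below shows the idea only; `expandFontPath` is a hypothetical helper, not taken from the Boot source, and the list of file extensions is an assumption about which font formats you host.

```javascript
// Expand a "{f}" path template into per-format font URLs.
// Illustrative only; not taken from the Boot source code.
function expandFontPath(template, fontName, formats = ["woff", "ttf"]) {
  // Replace every "{f}" placeholder with the font name.
  const base = template.split("{f}").join(fontName);
  return formats.map((ext) => base + "." + ext);
}

console.log(expandFontPath("/fonts/{f}/{f}-webfont", "opensans"));
// [ '/fonts/opensans/opensans-webfont.woff', '/fonts/opensans/opensans-webfont.ttf' ]
```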

我还没有时间为所有功能编写文档,但如果您有兴趣,可以在 GitHub 上获取该库(https://github.com/artzstudio/Boot)。

I haven’t had time to document all the goods, but the library is available on GitHub (https://github.com/artzstudio/Boot) if you are interested.

Gentlefonts,启动你的引擎!

Gentlefonts, Start Your Engines!

现在您已经掌握了确保快速加载字体所需的知识,请看一下实施选项的性能。

Now that you’re armed with the knowledge needed to ensure fast-loading fonts, take a look at the performance of the implementation options.

我设置了以下测试页面,加载相同的网络字体(Open Sans),跨越 DIY 以及 Typekit 和 Google 的各种托管选项:

I set up the following test pages, loading the same web font (Open Sans), spanning DIY and various hosting options at Typekit and Google:

我使用 http://webpagetest.org/,通过 1.5 Mbps DSL 连接在 Chrome、Firefox 7、IE7、IE8 和 IE9 中将每个测试页面加载 10 次。我们比较的是实现方式,因此我取每组中最快的一次测试,以消除网络延迟问题和其他造成数据差异的原因。

I used http://webpagetest.org/ and loaded each test page 10 times in Chrome, Firefox 7, IE7, IE8, and IE9 over a 1.5 Mbps DSL connection. We are comparing implementations, so I took the fastest test for each to weed out network latency issues and other causes of variance in the data.
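Taking the fastest of several runs is easy to reproduce on your own data. The sketch below assumes a hypothetical input shape of `{ browser, loadMs }` objects, and the numbers are invented for illustration; they are not the benchmark results.

```javascript
// Pick the fastest load time per browser from repeated test runs.
// `runs` uses a hypothetical shape: { browser: string, loadMs: number }.
function fastestByBrowser(runs) {
  const fastest = new Map();
  for (const { browser, loadMs } of runs) {
    if (!fastest.has(browser) || loadMs < fastest.get(browser)) {
      fastest.set(browser, loadMs);
    }
  }
  return Object.fromEntries(fastest);
}

// Invented sample data, not the measurements from this chapter.
const sampleRuns = [
  { browser: "IE9", loadMs: 910 },
  { browser: "IE9", loadMs: 870 },
  { browser: "Chrome", loadMs: 640 },
  { browser: "Chrome", loadMs: 700 },
];
console.log(fastestByBrowser(sampleRuns)); // { IE9: 870, Chrome: 640 }
```

Using the minimum rather than the mean favors the best case for each implementation, which suits a comparison of implementations but says little about what typical users experience.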

图 33-4显示了它们如何叠加,按跨浏览器的最快时间(毫秒)排名。

Figure 33-4 shows how they stack up, ranked by the fastest time (ms) across browsers.

图 33-4。按实现和浏览器划分的最快加载时间(毫秒)

Figure 33-4. Fastest Load Times (ms) by Implementation and Browser

花一些时间来消化数据。为了更好地比较跨浏览器的实现,请查看图 33-5 (IE9)、图 33-6 (IE8)、图 33-7 (IE7)、图 33-8 (Firefox) 和图 33-9 (Chrome)。

Take some time to digest the data. To better compare implementations across browsers, check out the charts in Figure 33-5 (IE9), Figure 33-6 (IE8), Figure 33-7 (IE7), Figure 33-8 (Firefox), and Figure 33-9 (Chrome).

图 33-5。字体实现基准:Internet Explorer 9

Figure 33-5. Font Implementation Benchmarks: Internet Explorer 9

图 33-6。字体实现基准:Internet Explorer 8

Figure 33-6. Font Implementation Benchmarks: Internet Explorer 8

图 33-7。字体实现基准:Internet Explorer 7

Figure 33-7. Font Implementation Benchmarks: Internet Explorer 7

图 33-8。字体实现基准:Firefox

Figure 33-8. Font Implementation Benchmarks: Firefox

图 33-9。字体实现基准:Chrome

Figure 33-9. Font Implementation Benchmarks: Chrome

我的观察

My Observations

DIY 实施始终是最快的,尤其是与 CDN 结合使用时。这是由于物理原因——提供字体所需的字节、请求和 CPU 开销更少。

The Do-It-Yourself implementations were consistently the fastest, especially when combined with a CDN. This is due to physics: fewer bytes and requests and less CPU overhead are required to serve the font.

将 Google Web Fonts (GWF) 与 Typekit 进行比较很有趣,因为它们使用相同的核心加载器,但相似之处仅此而已(图 33-10、图 33-11)。

It is interesting to compare Google Web Fonts (GWF) to Typekit since they use the same core loader, but that is where the similarities end (Figure 33-10, Figure 33-11).

图 33-10。Firefox 中的 Google Web 字体(1254 毫秒):JS » CSS » 字体

Figure 33-10. Google Web Fonts in Firefox (1254ms): JS » CSS » Font

图 33-11。Firefox 中的 Typekit(795 毫秒):JS » CSS 数据 URI

Figure 33-11. Typekit in Firefox (795ms): JS » CSS Data URIs

在支持数据 URI 的浏览器中,Typekit 使用 CSS 中的数据 URI ( http://www.webpagetest.org/result/111231_2K_2PNEM/10/details/ ) 来加载字体,而 GWF 首先加载 JS,然后加载 CSS,最后加载字体(http://www.webpagetest.org/result/111231_13_2PNDW/9/details/)。在不支持数据 URI 的 IE 8 及更低版本 ( http://www.webpagetest.org/result/111231_QZ_2PNEG/4/details/ ) 中,Typekit 也退回到这种依次加载的方式,导致这些浏览器中的加载时间变慢。Google 也因为多次 DNS 查找而速度较慢;Typekit 正确地为所有资源使用同一个域。

In browsers that support them, Typekit uses Data URIs in the CSS (http://www.webpagetest.org/result/111231_2K_2PNEM/10/details/) to load the font, whereas GWF first loads the JS, then the CSS, and finally the font (http://www.webpagetest.org/result/111231_13_2PNDW/9/details/). Typekit falls back to that same sequential approach in IE 8 and lower (http://www.webpagetest.org/result/111231_QZ_2PNEG/4/details/), where Data URIs are not supported, and ends up with slower load times in those browsers. Google is also slower because of its multiple DNS lookups; Typekit rightly uses one domain for all assets.
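The Data URI technique embeds the font bytes directly in the stylesheet, so the font arrives with the CSS instead of in a separate request. The following sketch builds such a rule from an already base64-encoded WOFF file; the function name is hypothetical, the `font/woff` MIME type is an assumption about how the font is labeled, and the output is not a copy of what Typekit actually serves.

```javascript
// Build an @font-face rule that embeds a base64-encoded WOFF font
// as a Data URI, avoiding a separate HTTP request for the font file.
function fontFaceWithDataUri(family, base64Woff) {
  return (
    "@font-face {\n" +
    '  font-family: "' + family + '";\n' +
    '  src: url("data:font/woff;base64,' + base64Woff + '") format("woff");\n' +
    "}"
  );
}

// "d09GRg==" is just the base64 form of the 4-byte "wOFF" signature,
// used here as a placeholder rather than a complete font.
console.log(fontFaceWithDataUri("Open Sans", "d09GRg=="));
```

The trade-off is that base64 encoding inflates the font by roughly a third and ties the font's cache lifetime to that of the stylesheet.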

Boot.getFont 的性能给我留下了深刻的印象,在所有情况下,它都比标准 @font-face CSS 更快(有时快一点,有时更快)。我的假设是,JS 以某种方式触发了回流/重绘,迫使字体在所有浏览器中更快下载。

I was impressed by the performance of Boot.getFont, which ended up being faster (sometimes by a hair, sometimes more) than the standard @font-face CSS in all cases. My hypothesis is that somehow the JS triggers a reflow/repaint that forces the fonts to download sooner in all browsers.

最后的想法

Final Thoughts

虽然本文可能会分成几篇文章,但我想要一个地方来记录实现选择、优化它们的技巧,并提供一些参考基准。如果其他字体提供商想要为我提供一个免费帐户(并托管 Open Sans,以保持一致性),我很乐意将它们纳入其他时间的另一项研究中。

While this article could probably be split into several, I wanted a single place to document implementation choices, tips for optimizing them, and have some reference benchmarks. If other font providers want to hook me up with a free account (and host Open Sans, for consistency), I’d be happy to include them in another study at another time.

我再次失望地看到谷歌提供了另一个(http://www.artzstudio.com/2011/06/googles-button-is-slow-and-so-is-facebooks/)缓慢的服务。Google 的朋友们,从 Typekit 中做一些笔记吧!

I was again disappointed to see Google turn out another (http://www.artzstudio.com/2011/06/googles-button-is-slow-and-so-is-facebooks/) slow service. Google friends, take some notes from Typekit!

我期待听到您对此实验的想法和观察,以及您对加快网络字体速度的建议。谢谢阅读!

I am looking forward to hearing your thoughts and observations on this experiment, and to your recommendations for speeding up web fonts. Thanks for reading!

笔记

Note

要对本章发表评论,请访问http://www.artzstudio.com/2012/02/web-font-performance-weighing-fontface-options-and-alternatives/。最初发布于 2012 年 2 月 27 日。

To comment on this chapter, please visit http://www.artzstudio.com/2012/02/web-font-performance-weighing-fontface-options-and-alternatives/. Originally published on Feb 27, 2012.

关于作者

About the Author

Stoyan Stefanov(http://phpied.com,@stoyanstefanov)是一名 Facebook 工程师。此前在雅虎任职期间,他是 smush.it 在线图像优化工具的创建者和 YSlow 2.0 性能工具的架构师。他是图书作者(JavaScript 模式、面向对象的 JavaScript)、贡献者(甚至更快的网站、高性能 JavaScript)和演讲者(Velocity、JSConf、Fronteers、Ajax Experience)。

Stoyan Stefanov (http://phpied.com, @stoyanstefanov) is a Facebook engineer. Previously at Yahoo!, he was the creator of the smush.it online image optimization tool and architect of the YSlow 2.0 performance tool. He is a book author (JavaScript Patterns, Object-Oriented JavaScript), contributor (Even Faster Web Sites, High-Performance JavaScript), and speaker (Velocity, JSConf, Fronteers, Ajax Experience).

版画

Colophon

《网络性能日记》第 2 卷封面上的动物是糖松鼠比亚克滑翔机。松鼠滑翔机(Petaurus norfolcensis)是一种夜间滑翔的负鼠,不要与飞鼠混淆。北美飞鼠是胎盘哺乳动物,而松鼠滑翔机是有袋动物。

The animal on the cover of Web Performance Daybook Volume 2 is a Sugar Squirrel Biak Glider. The squirrel glider (Petaurus norfolcensis) is a nocturnal gliding possum, not to be confused with the flying squirrel. The flying squirrel of North America is a placental mammal, while the squirrel glider is a marsupial.

松鼠滑翔机原产于从南澳大利亚和维多利亚州边境到澳大利亚东南部(它们栖息在干燥的森林和林地)到昆士兰州北部(它们栖息在潮湿的桉树林)的范围内。这些腕翼滑翔机在挖空的树上安家,在巢穴里铺上树叶。通常,它们生活在由一只雄性、两只雌性和后代组成的群体中。

Squirrel gliders are native to the range from the South Australian and Victorian Border through southeast Australia, where they inhabit dry forest and woodlands, to northern Queensland, where they inhabit a wetter eucalypt forest. These wrist-winged gliders make their home in hollowed out trees, lining their dens with leaves. Typically, they live in groups of one male, two females, and offspring.

松鼠滑翔机的食物主要包括昆虫和水果,其次是桉树和红血木品种的树液、花粉、花蜜、树叶和树皮。松鼠滑翔机有与环尾负鼠相当的浓密尾巴,并用它作为额外的肢体来缠绕树枝以抓住。

The squirrel glider’s diet consists predominantly of insects and fruit, followed up by tree sap of the Eucalypt and Red Bloodwood variety, pollen, nectar, leaves, and bark. Squirrel gliders have bushy tails comparable to the ring tail possum, and use it as an extra limb to wrap around branches to hold on.

封面字体为 Adobe ITC Garamond。文字字体为 Linotype Birka;标题字体为 Adobe Myriad Condensed;代码字体是 LucasFont 的 TheSansMonoCondensed。

The cover font is Adobe ITC Garamond. The text font is Linotype Birka; the heading font is Adobe Myriad Condensed; and the code font is LucasFont’s TheSansMonoCondensed.

Web 性能日志,第 2 卷

Web Performance Daybook, Volume 2

斯托扬·斯特凡诺夫

Stoyan Stefanov

编辑

Editor

玛丽·特雷塞勒

Mary Treseler

修订记录
2012-06-15首次发布

购买 O'Reilly 书籍可用于教育、商业或促销用途。大多数图书也提供在线版本 ( http://my.safaribooksonline.com )。欲了解更多信息,请联系我们的企业/机构销售部门:800-998-9938或

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or .

Nutshell Handbook、Nutshell Handbook 徽标和 O'Reilly 徽标是 O'Reilly Media, Inc. 的注册商标。Web Performance Daybook Volume 2、糖松鼠 biak 滑翔机的封面图片以及相关商业外观是 O'Reilly Media, Inc. 的商标。

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Web Performance Daybook Volume 2, the cover image of a sugar squirrel biak glider, and related trade dress are trademarks of O’Reilly Media, Inc.

制造商和销售商用来区分其产品的许多名称都被称为商标。如果这些名称出现在本书中,并且 O'Reilly Media, Inc. 知道商标声明,则这些名称均以大写字母或首字母大写字母印刷。

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

尽管在编写本书时已采取一切预防措施,但出版商和作者对错误或遗漏或因使用本文所含信息而造成的损害不承担任何责任。

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

奥莱利媒体

格拉文斯坦公路北1005号

1005 Gravenstein Highway North

塞瓦斯托波尔, CA 95472

Sebastopol, CA 95472

2012-06-18T08:05:49-07:00
